In-context learning

In-context learning — A model's ability to perform a new task from examples or instructions placed in the prompt, without any update to its weights.

Overview

In-context learning (ICL) is the ability of a language model to adapt its behavior to a new task using only examples or instructions included in the prompt at inference time, without gradient updates to the model's parameters. The model observes a pattern in the context and continues it, generalizing from the in-context examples rather than from task-specific training.

ICL was named and studied systematically in the GPT-3 paper (Brown et al., 2020), which showed that large models could solve tasks they were not explicitly trained on when a few worked examples were included in the prompt.[1]

The mechanism underlying ICL remains an open research question: proposed explanations include in-context Bayesian inference, implicit gradient descent over the context, and retrieval of similar training patterns.
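As a toy illustration of the "implicit gradient descent" hypothesis only, the sketch below checks a simple identity: for a linear model started at zero weights, one gradient step on the in-context examples gives the same prediction as unnormalized linear attention with the example inputs as keys and the example outputs as values. The function names, the data, and the learning rate are assumptions made for this example; it says nothing about how a real transformer behaves.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def gd_one_step_predict(xs, ys, x_query, lr):
    """Predict after one gradient step from w = 0 on 0.5 * sum((w.x_i - y_i)^2)."""
    dim = len(x_query)
    # The gradient at w = 0 is -sum(y_i * x_i), so the step yields w = lr * sum(y_i * x_i).
    w = [lr * sum(y * x[d] for x, y in zip(xs, ys)) for d in range(dim)]
    return dot(w, x_query)

def linear_attention_predict(xs, ys, x_query, lr):
    """Unnormalized linear attention: keys x_i, values y_i, query x_query."""
    return lr * sum(y * dot(x, x_query) for x, y in zip(xs, ys))

# Hypothetical in-context examples (inputs xs, targets ys) and a query point.
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ys = [2.0, -1.0, 1.0]
x_query = [0.5, 2.0]
lr = 0.1

gd_pred = gd_one_step_predict(xs, ys, x_query, lr)
attn_pred = linear_attention_predict(xs, ys, x_query, lr)
print(gd_pred, attn_pred)  # the two predictions agree up to floating-point error
```

The two functions compute the same sum in a different order, which is why they agree; whether trained models actually implement something like this is part of the open question.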

Variants by number of examples

| Variant | Examples in prompt | Notes |
| --- | --- | --- |
| Zero-shot | 0 (task described in instructions only) | Relies entirely on pre-training generalization |
| One-shot | 1 | Single example to demonstrate format and task |
| Few-shot | 2 to about 20 | Standard ICL; diminishing returns beyond about 10 examples for most tasks |
| Many-shot | Hundreds to thousands | Enabled by large context windows; useful for complex classification |

Zero-shot prompting is a degenerate case of ICL in which no examples are provided; the model generalizes from the task description alone.
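The variants differ only in how many worked examples precede the query. The following is a minimal sketch of that idea, assuming a plain-text "Input:/Output:" layout; the helper name `build_prompt`, the layout, and the sentiment examples are illustrative choices, not a format prescribed by any particular model or API.

```python
from typing import List, Tuple

def build_prompt(instruction: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt: instruction, k worked examples, then the open query."""
    parts = [instruction]
    for text, label in examples:
        parts.append(f"Input: {text}\nOutput: {label}")
    # The final block leaves "Output:" open for the model to complete.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

# Hypothetical task and demonstrations, made up for this sketch.
instruction = "Classify the sentiment of each input as positive or negative."
demos = [
    ("The battery lasts all day.", "positive"),
    ("The screen cracked within a week.", "negative"),
    ("Setup took two minutes.", "positive"),
]
query = "The speaker crackles at low volume."

zero_shot = build_prompt(instruction, [], query)        # instructions only
one_shot = build_prompt(instruction, demos[:1], query)  # one demonstration
few_shot = build_prompt(instruction, demos, query)      # several demonstrations
print(few_shot)
```

The same function covers all three cases because the model's weights are never touched; only the text sent to the model changes.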

Distinction from fine-tuning

| Dimension | In-context learning | Fine-tuning |
| --- | --- | --- |
| Weight update | No | Yes |
| Persistence | Single call only | Permanent (until re-trained) |
| Cost | Inference tokens only | Compute + storage for training run |
| Flexibility | Any task, any call | Task-specific after training |
| Limit | Context window | Training data volume |

ICL is preferred when tasks vary per call or labeled data is scarce. Fine-tuning is preferred when behavior must be stable across calls or when the task requires knowledge that prompting does not capture well.
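Because ICL is bounded by the context window, the number of usable examples depends on how many fit alongside the query. A rough sketch of that constraint follows; `rough_token_count` is a hypothetical stand-in that counts whitespace-separated words, and a real system would use the model's own tokenizer and also reserve room for the instructions and the response.

```python
from typing import List, Tuple

def rough_token_count(text: str) -> int:
    """Hypothetical stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())

def select_examples(examples: List[Tuple[str, str]], query: str, budget: int) -> List[Tuple[str, str]]:
    """Keep examples in order until adding the next one would exceed the budget."""
    used = rough_token_count(query)
    kept = []
    for text, label in examples:
        cost = rough_token_count(text) + rough_token_count(label)
        if used + cost > budget:
            break
        kept.append((text, label))
        used += cost
    return kept

# Made-up demonstrations and budgets, for illustration only.
budget_demos = [
    ("good phone", "positive"),
    ("bad screen and slow charging", "negative"),
    ("fine", "positive"),
]
budget_query = "is it worth buying"

small = select_examples(budget_demos, budget_query, budget=10)
large = select_examples(budget_demos, budget_query, budget=20)
print(len(small), len(large))
```

A fine-tuned model does not face this per-call trade-off, since its task knowledge is stored in the weights instead of the prompt.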

Relationship to chain-of-thought

Chain-of-thought prompting is ICL in which examples include explicit intermediate reasoning steps, not just input-output pairs. The model then generates intermediate reasoning before answering.
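The difference from ordinary few-shot prompting can be sketched as follows: each demonstration carries its reasoning text before the answer. The helper name `build_cot_prompt`, the "Q:/A:" layout, and the arithmetic example are assumptions made for illustration.

```python
from typing import List, Tuple

def build_cot_prompt(examples: List[Tuple[str, str, str]], question: str) -> str:
    """Build a prompt whose demonstrations include reasoning before the answer."""
    parts = []
    for q, reasoning, answer in examples:
        parts.append(f"Q: {q}\nA: {reasoning} The answer is {answer}.")
    # Leave the final answer open so the model writes its own reasoning first.
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

# One hypothetical worked example with explicit intermediate steps.
cot_demos = [
    (
        "A shelf holds 3 boxes with 4 books each. How many books are there?",
        "Each box has 4 books and there are 3 boxes, so 3 * 4 = 12.",
        "12",
    ),
]
cot_prompt = build_cot_prompt(
    cot_demos,
    "A crate holds 5 bags with 6 apples each. How many apples are there?",
)
print(cot_prompt)
```

Dropping the reasoning string from each demonstration would turn this back into a standard input-output few-shot prompt.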

References

  1. Brown, Tom et al. "Language Models are Few-Shot Learners." NeurIPS 2020. https://arxiv.org/abs/2005.14165