Many-Shot In-Context Learning
Large language models (LLMs) excel at few-shot in-context learning (ICL) – learning from a few examples provided in context at inference, without any weight updates.
Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples – the many-shot regime.
Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative and discriminative tasks.
While promising, many-shot ICL can be bottlenecked by the limited availability of human-generated outputs.
To mitigate this limitation, we explore two new settings: “Reinforced ICL” and “Unsupervised ICL”.
Reinforced ICL uses model-generated chain-of-thought rationales in place of human rationales.
Unsupervised ICL removes rationales from the prompt altogether, and prompts the model only with domain-specific inputs.
We find that both Reinforced and Unsupervised ICL can be quite effective in the many-shot regime, particularly on complex reasoning tasks.
Finally, we demonstrate that, unlike few-shot learning, many-shot learning is effective at overriding pretraining biases and can learn high-dimensional functions with numerical inputs.
Our analysis also reveals the limitations of next-token prediction loss as an indicator of downstream performance.
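To make the two settings concrete, the following is a minimal sketch of how the corresponding prompts might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the prompt template, the function names, and the `sample_rationale` callable (a stand-in for a call to the model) are all hypothetical. The sketch also assumes that model-generated rationales are kept only when their final answer matches a known reference answer, a filtering step that the abstract itself does not spell out.

```python
from typing import Callable, List, Tuple


def build_reinforced_prompt(
    problems: List[Tuple[str, str]],
    sample_rationale: Callable[[str], Tuple[str, str]],
    test_question: str,
    max_shots: int,
) -> str:
    """Assemble a prompt whose shots use model-generated rationales.

    `problems` holds (question, reference_answer) pairs.
    `sample_rationale` is a hypothetical stand-in for a model call that
    returns (rationale, final_answer) for a question.
    """
    shots = []
    for question, reference in problems:
        if len(shots) >= max_shots:
            break
        rationale, answer = sample_rationale(question)
        # Assumption: keep a rationale only if it reaches the reference answer.
        if answer.strip() == reference.strip():
            shots.append(
                f"Question: {question}\nRationale: {rationale}\nAnswer: {answer}"
            )
    # The test question comes last and is left open for the model to complete.
    shots.append(f"Question: {test_question}\nRationale:")
    return "\n\n".join(shots)


def build_unsupervised_prompt(
    questions: List[str],
    test_question: str,
    max_shots: int,
) -> str:
    """Assemble a prompt that contains only inputs, with no rationales or answers."""
    shots = [f"Question: {q}" for q in questions[:max_shots]]
    shots.append(f"Question: {test_question}\nAnswer:")
    return "\n\n".join(shots)
```

In the many-shot regime, `max_shots` would be in the hundreds or thousands, bounded by the model's context window. The only difference between the two builders is what each shot carries: a generated rationale and answer in the first case, the bare input in the second.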
https://arxiv.org/html/2404.11018v1