In-context learning

Creator
Seonglae Cho
Created
2023 Jun 11 11:13
Editor
Edited
2024 Jun 23 3:4

Emergent ability of the Attention Mechanism, arising from the Induction head

In modern language models, tokens later in the context are easier to predict than tokens earlier in the context. As the context gets longer, loss goes down. In some sense this is just what a sequence model is designed to do (use earlier elements in the sequence to predict later ones), but as the ability to predict later tokens from earlier ones gets better, it can increasingly be used in interesting ways (such as specifying tasks, giving instructions, or asking the model to match a pattern) that suggest it can usefully be thought of as a phenomenon of its own. When thought of in this way, it is usually referred to as in-context learning.
Emergent in-context learning was noted in GPT-2 and gained significant attention in GPT-3. Simply by adjusting a “prompt”, transformers can be adapted to do many useful things without re-training, such as translation, question-answering, arithmetic, and many other tasks. Using “prompt engineering” to leverage in-context learning became a popular topic of study and discussion.
- Anthropic AI
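
A minimal few-shot prompting sketch of this idea, assuming the Hugging Face transformers library and the small publicly available gpt2 checkpoint (checkpoint and generation settings are illustrative; a larger model completes the pattern far more reliably). The translation task is specified entirely in the prompt, with no weight updates:

```python
# Few-shot in-context learning: the task is defined by examples in the prompt,
# not by fine-tuning. Assumes `pip install transformers` and the gpt2 checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "peppermint => menthe poivrée\n"
    "plush giraffe =>"
)

# Greedy decoding; the model continues the pattern set by the in-context examples.
out = generator(prompt, max_new_tokens=5, do_sample=False)
print(out[0]["generated_text"])
```

Swapping the in-context examples (to question-answer pairs, arithmetic problems, etc.) repurposes the same frozen weights for a different task.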
In-context learning Notion

OpenAI paper

Few-shot PEFT is more cost-efficient than in-context learning for a specific task
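
For contrast, a rough sketch of what few-shot PEFT looks like, assuming the Hugging Face peft library with a LoRA adapter on gpt2 (library, checkpoint, and hyperparameters are illustrative, not the specific setup the cited comparison uses). Only a small set of adapter weights is trained for the task, so repeated inference does not pay the long-prompt cost of carrying in-context examples on every call:

```python
# Parameter-efficient fine-tuning (LoRA) sketch. Assumes `pip install transformers peft`.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")

# GPT-2 packs its attention projections into a single `c_attn` module.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["c_attn"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
# Only the LoRA adapter weights (a fraction of a percent of the model) are trainable;
# these would then be fine-tuned on a handful of task-specific examples.
model.print_trainable_parameters()
```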

Overall summary (Korean)
