Yet another RoPE extensioN
The reparametrization of RoPE as a set of 2D matrices has a clear benefit for the implementation of this attention scaling: we can instead use a "length scaling" trick which scales both q_m and k_n by a constant factor √(1/t), simply by scaling the complex RoPE embeddings by the same amount.
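As a minimal sketch of this trick (the names `rope_cache`, `apply_rope`, and `attn_factor` are illustrative, not from the paper), the √(1/t) factor can be folded into the precomputed cos/sin tables, so q_m and k_n each pick up √(1/t) when rotated and the attention code itself stays untouched:

```python
import math
import torch

def rope_cache(seq_len, head_dim, base=10000.0, attn_factor=1.0):
    """Precompute RoPE cos/sin tables, pre-scaled by attn_factor = sqrt(1/t).

    Because the factor is baked into the cache, q_m and k_n are each scaled by
    sqrt(1/t) when rotated, so their dot product gains the full 1/t softmax
    temperature without any change to the attention kernel.
    """
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    angles = torch.outer(torch.arange(seq_len).float(), inv_freq)  # (seq_len, head_dim/2)
    return torch.cos(angles) * attn_factor, torch.sin(angles) * attn_factor

def apply_rope(x, cos, sin):
    """Rotate feature pairs (first half, second half) by the cached angles."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)

# Illustrative usage: s is the context-extension scale factor; the paper
# suggests roughly sqrt(1/t) = 0.1 * ln(s) + 1 as the scaling amount.
s = 16384 / 4096
attn_factor = 0.1 * math.log(s) + 1.0
cos, sin = rope_cache(seq_len=16384, head_dim=64, attn_factor=attn_factor)
q = torch.randn(1, 16384, 64)        # (batch, seq, head_dim)
q_rot = apply_rope(q, cos, sin)      # the same call is used for k
```

Since both queries and keys are rotated with the same pre-scaled cache, the q·k logits are scaled by 1/t overall, which is exactly the attention-temperature adjustment the trick is meant to implement.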
Results
- YaRN doesn't just perform well at the longer context lengths seen during fine-tuning; it can also extrapolate beyond the limited context of the fine-tuning dataset.
- YaRN combined with Dynamic Scaling at inference time (Dynamic-YaRN) allows for more than 2x context window extension without any fine-tuning.
- YaRN allows efficient extrapolation with fine-tuning on shorter datasets and can take advantage of transfer learning for faster convergence.