LoRA

Creator
Seonglae Cho
Created
2023 Jun 22 15:24
Edited
2024 Nov 22 21:38

Low-Rank Adaptation

LoRA is a technique that freezes the pretrained weights and optimizes low-rank decomposition matrices of a small rank r (the LoRA module rank).
Existing PEFT techniques either reduce the model's usable sequence length (prompt/prefix tuning) or extend the model depth and add inference latency (adapter layers). LoRA is usually applied to an MLP or linear projection as two low-rank matrices inserted before the activation.
Freeze the pretrained weights and append an adaptation layer built from an adaptation matrix; a low-rank adaptation matrix has been shown to be sufficient for fine-tuning. By factorizing the adaptation matrix like a
Bottleneck layer
, the number of trainable parameters stays small and comparable results are achieved with less computing resources.
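A minimal sketch of this decomposition (assuming PyTorch; the LoRALinear class and its hyperparameters are illustrative, not taken from a specific library):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)       # freeze pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # random init
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # zero init, update starts at 0
        self.scaling = alpha / r

    def forward(self, x):
        # W x + scaling * (B A) x, where only A and B receive gradients
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
```

Only r × (in_features + out_features) parameters are trained per adapted layer, and at inference the update B @ A can be merged back into the frozen weight, so no extra depth or latency is added.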
A learning rate of 1e-4 has become the standard when fine-tuning LLMs with LoRA. Training loss instabilities occasionally occur, and reducing the learning rate to lower values like 3e-5 helps stabilize training. LoRA's weights are initialized randomly (typically A at random and B to zero), but techniques like
EVA
decompose activation vectors (e.g., via SVD) and use the result to initialize the matrices, prioritizing the components that explain the most variance.
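A hedged usage sketch with the Hugging Face peft library (the checkpoint name and target module names are placeholders and depend on the architecture):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder checkpoint
config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor (alpha / r)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which projections to adapt; model-dependent
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()        # only the LoRA matrices are trainable
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # common default; drop toward 3e-5 if loss is unstable
```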
LoRA Usages
LoRA fine-tuning introduces distinct high-magnitude singular directions, often called intruder dimensions, that don't appear in conventional full fine-tuning. These components influence how the model generalizes.
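One way to look for them (a sketch, assuming you have the base weight and the LoRA-merged weight of the same layer) is to check how much each top singular vector of the tuned weight overlaps with the base model's singular vectors; intruder dimensions show up as high-magnitude directions with low overlap:

```python
import torch

def intruder_score(w_base: torch.Tensor, w_tuned: torch.Tensor, k: int = 10):
    """For each top singular vector of the tuned weight, report its maximum
    cosine similarity with the base weight's singular vectors.
    Low similarity suggests a new ("intruder") direction introduced by fine-tuning."""
    u_base, _, _ = torch.linalg.svd(w_base, full_matrices=False)
    u_tuned, s_tuned, _ = torch.linalg.svd(w_tuned, full_matrices=False)
    sims = (u_tuned[:, :k].T @ u_base).abs().max(dim=1).values
    return s_tuned[:k], sims  # top singular values and their overlap with the base subspace
```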

Model Regularization

LoRA Learns Less and Forgets Less

It is enough to adapt only the
Cross-Attention
layers for fine-tuning.
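For example, with peft on a diffusers-style UNet (a sketch; attn2 denotes cross-attention there, and these module names are an assumption that may differ by model), LoRA can be restricted to the cross-attention projections:

```python
from peft import LoraConfig

# Adapt only the cross-attention projections (attn2.* in diffusers naming).
cross_attention_lora = LoraConfig(
    r=4,
    lora_alpha=4,
    target_modules=["attn2.to_q", "attn2.to_k", "attn2.to_v", "attn2.to_out.0"],
)
```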

Tips


Recommendations