QLoRA

Creator: Seonglae Cho
Created: 2023 Jun 22 15:19
Edited: 2024 Mar 31 15:17

LoRA + Model Quantization

4-bit NormalFloat (NF4) + Double Quantization + Paged Optimizers = memory optimization
  1. Store the frozen base-model weights quantized in 4-bit NormalFloat (NF4) form
  2. Dequantize the 4-bit values to BF16 on the fly during the forward and backward passes, so gradients flow only into the BF16 LoRA adapters (see the sketch below)
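In code, this recipe maps onto the Hugging Face stack. Below is a minimal sketch assuming the transformers, peft, and bitsandbytes libraries; the base model name, target modules, and LoRA hyperparameters are placeholder assumptions, not values from this page or the paper.

```python
# Minimal QLoRA setup sketch: NF4 storage + double quantization + BF16 compute + LoRA adapters.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store frozen base weights in 4-bit
    bnb_4bit_quant_type="nf4",              # 4-bit NormalFloat (NF4)
    bnb_4bit_use_double_quant=True,         # double quantization of the quantization constants
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to BF16 for forward/backward
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",             # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)  # prepare casting/grad hooks for k-bit training

lora_config = LoraConfig(
    r=16,                                   # placeholder rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # placeholder target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)  # only the BF16 LoRA adapters are trainable
model.print_trainable_parameters()
```

The memory savings come from keeping the frozen base weights in NF4 and training only the small BF16 adapter matrices.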
QLoRA Usages
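For an actual fine-tuning run, the paged optimizer from the formula above is available through the Trainer. A minimal sketch, assuming `model` is the 4-bit + LoRA model from the earlier sketch and `train_dataset` is an already-tokenized dataset (both hypothetical placeholders, as are the hyperparameters):

```python
# Fine-tuning sketch with a paged 8-bit AdamW optimizer to absorb memory spikes.
from transformers import AutoTokenizer, Trainer, TrainingArguments, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # placeholder tokenizer

args = TrainingArguments(
    output_dir="qlora-out",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    bf16=True,
    optim="paged_adamw_8bit",  # paged optimizer states to avoid OOM on memory spikes
    logging_steps=10,
)

trainer = Trainer(
    model=model,                 # 4-bit base + LoRA adapters from the sketch above
    args=args,
    train_dataset=train_dataset, # assumed tokenized dataset (hypothetical)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("qlora-adapter")  # saves only the LoRA adapter weights
```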
Paper with parameter values

Korean

Recommendations