How to fix a size mismatch when loading a LoRA checkpoint with PEFT
Resolving errors such as `base_model.model.gpt_neox.layers.0.attention.query_key_value.lora_A.weight: copying a param with shape torch.Size([16, 5120]) from checkpoint, the shape in current model is torch.Size([8, 5120])`. The first dimension of `lora_A` is the LoRA rank, so this error means the checkpoint was trained with `r=16` while the current model was constructed with `r=8`; the `LoraConfig` used at load time must match the checkpoint's rank (and target modules).
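One way to avoid guessing is to read the rank directly off the checkpoint before building the PEFT model: LoRA stores `lora_A` with shape `(r, in_features)`, so `r` is simply the first dimension of any `lora_A` weight. A minimal sketch of that idea, using a simulated state dict matching the error message above (the tensor names and sizes mirror the error; `infer_lora_rank` is a hypothetical helper, not a PEFT API):

```python
import torch

# LoRA stores two low-rank matrices per target module:
#   lora_A: (r, in_features), lora_B: (out_features, r)
# so the rank the checkpoint was trained with can be read off lora_A.

def infer_lora_rank(state_dict: dict) -> int:
    """Return the LoRA rank r encoded in a checkpoint's lora_A weights."""
    ranks = {
        tensor.shape[0]
        for name, tensor in state_dict.items()
        if "lora_A" in name
    }
    if len(ranks) != 1:
        raise ValueError(f"expected a single LoRA rank, found: {ranks}")
    return ranks.pop()

# Simulated checkpoint matching the error message: r=16, in_features=5120.
ckpt = {
    "base_model.model.gpt_neox.layers.0.attention.query_key_value.lora_A.weight":
        torch.zeros(16, 5120),
    "base_model.model.gpt_neox.layers.0.attention.query_key_value.lora_B.weight":
        torch.zeros(15360, 16),
}

r = infer_lora_rank(ckpt)
print(r)  # the r to pass to LoraConfig so shapes match on load
```

With the rank known, build `LoraConfig(r=16, ...)` before calling `load_state_dict`. Better still, if the adapter directory contains `adapter_config.json`, load it with `PeftModel.from_pretrained(base_model, adapter_dir)`, which reads `r` from that config so no mismatch can occur.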
https://junbuml.ee/lora-ckpt-size-mismatch