
AI Model Memory

Creator: Seonglae Cho
Created: 2024 Mar 8 16:10
Editor: Seonglae Cho
Edited: 2024 Mar 8 16:14
  • Model Parameters (weights + biases)
  • Model Gradients (same size as the parameters)
  • Optimizer States (momentum, variance …)
  • Others (KV Cache …)
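
The components above can be turned into a rough back-of-the-envelope estimate. The sketch below is illustrative only: it assumes every parameter, gradient, and optimizer state is stored in fp32 (4 bytes), that the optimizer is Adam-like with two states per parameter (momentum and variance), and that the KV cache holds one key and one value tensor per layer. The function names and the 7-billion-parameter model are hypothetical, and activation memory is not counted.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes per GiB


@dataclass
class TrainingMemory:
    """Byte counts for the three training-time components listed above."""
    parameters: int
    gradients: int
    optimizer_states: int

    @property
    def total(self) -> int:
        return self.parameters + self.gradients + self.optimizer_states


def estimate_training_memory(num_params: int,
                             param_bytes: int = 4,
                             states_per_param: int = 2,
                             state_bytes: int = 4) -> TrainingMemory:
    """Estimate training memory (hypothetical helper, not a library API).

    Assumptions: gradients use the same dtype as the parameters, and the
    optimizer keeps `states_per_param` values per parameter (2 for Adam:
    momentum and variance).
    """
    return TrainingMemory(
        parameters=num_params * param_bytes,
        gradients=num_params * param_bytes,  # one gradient per parameter
        optimizer_states=num_params * states_per_param * state_bytes,
    )


def estimate_kv_cache(num_layers: int, num_kv_heads: int, head_dim: int,
                      seq_len: int, batch_size: int,
                      bytes_per_value: int = 2) -> int:
    """Estimate KV cache size in bytes (assumes fp16 values by default)."""
    # Factor 2: one key tensor and one value tensor per layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical 7-billion-parameter model, used only for illustration.
mem = estimate_training_memory(7_000_000_000)
print(f"parameters:       {mem.parameters / GIB:.1f} GiB")
print(f"gradients:        {mem.gradients / GIB:.1f} GiB")
print(f"optimizer states: {mem.optimizer_states / GIB:.1f} GiB")
print(f"total:            {mem.total / GIB:.1f} GiB")

# Hypothetical inference configuration for the KV cache.
kv = estimate_kv_cache(num_layers=32, num_kv_heads=32, head_dim=128,
                       seq_len=4096, batch_size=1)
print(f"KV cache:         {kv / GIB:.1f} GiB")
```

Under these assumptions the optimizer states alone take twice as much memory as the parameters, which is why they are listed as a separate component.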
 
 
https://www.union.ai/blog-post/fine-tune-llama-2-with-limited-resources
https://moon-walker.medium.com/large-model-학습의-game-changer-ms의-deepspeed-zero-1-2-3-그리고-zero-infinity-74c9640190de
A game changer for large model training: Microsoft's DeepSpeed ZeRO-1, 2, 3 and ZeRO-Infinity
DeepSpeed ZeRO makes serious use of heterogeneous computing for large model training, which can reduce the cost of training large models.
 
 
 

Copyright Seonglae Cho