Parallel Training

Creator: Alan Jo
Created: 2022 Mar 15 11:35
Editor: Alan Jo
Edited: 2024 Mar 31 14:55

Distributed training across multiple GPUs or multiple nodes

The two basic strategies are data parallelism and model parallelism

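  • In data parallelism, the training data is split into multiple parts, and each processor trains a full copy of the model on its own part
  • In model parallelism, the model itself is split, and different parts of the model are processed by separate processors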
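
The difference between the two strategies can be sketched with a toy example. This is a minimal illustration in plain Python, not a real multi-GPU setup: the "workers" and "devices" are ordinary function calls, and every name here (grad_mse, data_parallel_grad, stage_0, stage_1, model_parallel_forward) is hypothetical. It assumes a one-parameter model y_hat = w * x for the data-parallel part and a two-stage model w2 * (w1 * x) for the model-parallel part.

```python
# Toy sketch: "workers" and "devices" are simulated by plain function calls.

def grad_mse(w, xs, ys):
    """Gradient of the mean squared error for the model y_hat = w * x."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def data_parallel_grad(w, xs, ys, num_workers):
    """Data parallelism: every worker has the same w but a different data shard.

    Assumes len(xs) is divisible by num_workers so all shards have equal size.
    """
    shard = len(xs) // num_workers
    grads = []
    for k in range(num_workers):
        lo, hi = k * shard, (k + 1) * shard
        grads.append(grad_mse(w, xs[lo:hi], ys[lo:hi]))
    # Averaging the per-worker gradients stands in for an all-reduce step.
    return sum(grads) / num_workers


def stage_0(x, w1):
    """First part of the model, imagined to live on device 0."""
    return w1 * x


def stage_1(h, w2):
    """Second part of the model, imagined to live on device 1."""
    return w2 * h


def model_parallel_forward(x, w1, w2):
    """Model parallelism: the activation h is what device 0 would send to device 1."""
    h = stage_0(x, w1)
    return stage_1(h, w2)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full_batch_grad = grad_mse(0.5, xs, ys)
sharded_grad = data_parallel_grad(0.5, xs, ys, num_workers=2)
output = model_parallel_forward(3.0, w1=2.0, w2=0.5)
```

With equal-sized shards, the averaged per-worker gradient matches the full-batch gradient, which is why data parallelism can reproduce single-device training. In the model-parallel half, no single function holds both w1 and w2; only the intermediate activation crosses the boundary.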

Combinations of these strategies are referred to as 3D parallelism or 4D parallelism
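
As a rough sketch of what the "3D" means, assume it refers to the common combination of data, pipeline, and tensor parallelism (the note itself does not spell out the dimensions). Under that assumption, the number of processes is the product of the three degrees, and each process sits at one coordinate in a 3D grid. The function rank_to_coords and the ordering of the axes (tensor index varying fastest) are hypothetical choices for illustration; real frameworks define their own layouts.

```python
def rank_to_coords(rank, dp, pp, tp):
    """Map a flat process rank to (data, pipeline, tensor) grid coordinates.

    The axis ordering is an assumption made for this sketch:
    the tensor-parallel index varies fastest, the data-parallel index slowest.
    """
    if not 0 <= rank < dp * pp * tp:
        raise ValueError("rank out of range for the given grid")
    tp_idx = rank % tp
    pp_idx = (rank // tp) % pp
    dp_idx = rank // (tp * pp)
    return dp_idx, pp_idx, tp_idx


# Hypothetical degrees: 2-way data, 2-way pipeline, 2-way tensor parallelism.
dp, pp, tp = 2, 2, 2
world_size = dp * pp * tp
grid = [rank_to_coords(r, dp, pp, tp) for r in range(world_size)]
```

Every rank gets a distinct coordinate, so the three degrees multiply to give the total process count. A 4D scheme would add one more axis to the same product.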

Parallel Training Notion

https://xiandong79.github.io