Tensor Parallelism

Creator: Seonglae Cho
Created: 2023 Nov 1 8:24
Editor: Seonglae Cho
Edited: 2024 Mar 31 15:43
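Split each large weight matrix across devices so that every device stores and computes only its own shard.
  • requires rewriting the model code so layers operate on shards
  • combine column-wise and row-wise slicing to reduce synchronization points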
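The column/row combination can be sketched on a two-layer MLP block. This is a minimal single-process illustration in NumPy, not the code from the referenced post: the function names (`mlp_reference`, `mlp_tensor_parallel`), the ReLU activation, and the shapes are assumptions chosen for the example, and a plain Python `sum` stands in for the all-reduce that a real multi-device setup would perform. The first weight is sliced by columns, the second by rows, so each shard runs both matmuls locally and only one synchronization is needed at the end.

```python
import numpy as np


def mlp_reference(x, w1, w2):
    """Unsharded two-layer MLP: ReLU(x @ w1) @ w2."""
    return np.maximum(x @ w1, 0.0) @ w2


def mlp_tensor_parallel(x, w1, w2, num_shards):
    """Simulate tensor parallelism over `num_shards` hypothetical devices."""
    # Column-slice the first weight: each shard produces a slice of the hidden units.
    w1_shards = np.split(w1, num_shards, axis=1)
    # Row-slice the second weight: each shard consumes exactly its own hidden slice.
    w2_shards = np.split(w2, num_shards, axis=0)

    # ReLU is elementwise, so it can be applied per shard without communication.
    partials = [
        np.maximum(x @ a, 0.0) @ b
        for a, b in zip(w1_shards, w2_shards)
    ]

    # Single sync point: summing the partial outputs plays the role of an all-reduce.
    return sum(partials)


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # (batch, model_dim)
w1 = rng.standard_normal((8, 16))  # (model_dim, hidden_dim)
w2 = rng.standard_normal((16, 8))  # (hidden_dim, model_dim)

out = mlp_tensor_parallel(x, w1, w2, num_shards=4)
print(np.allclose(out, mlp_reference(x, w1, w2)))
```

Slicing both weights by columns instead would require gathering the hidden activations between the two matmuls, which adds a second synchronization point per block.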
 
Accelerating Generative AI with PyTorch II: GPT, Fast
This post is the second part of a multi-series blog focused on how to accelerate generative AI models with pure, native PyTorch. We are excited to share a breadth of newly released PyTorch performance features alongside practical examples to see how far we can push PyTorch native performance. In part one, we showed how to accelerate Segment Anything over 8x using only pure, native PyTorch. In this blog we’ll focus on LLM optimization.
https://pytorch.org/blog/accelerating-generative-ai-2/
 
 

Copyright Seonglae Cho