torch.distributed members
- torch.distributed.c10d
- torch.distributed.fsdp
- torch.distributed.elastic
- torchft
- Monarch

Backends
- NCCL → GPU communication (default, fastest)
- Gloo → CPU and fallback communication (GPU possible but slower)
- MPI → communication via an MPI process launcher (non-standard, rarely used)

Reference: Distributed communication package - torch.distributed — PyTorch documentation (see the PyTorch Distributed Overview for a brief introduction to all distributed training features)
https://pytorch.org/docs/stable/distributed.html
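
A minimal sketch of the backend choice above: pick NCCL when GPUs are available, fall back to Gloo on CPU-only machines, and initialize the process group. It assumes launch via torchrun (which sets RANK, WORLD_SIZE, LOCAL_RANK, MASTER_ADDR, and MASTER_PORT); the all_reduce at the end is only an illustrative sanity check, and the script name in the usage line is hypothetical.

```python
import os
import torch
import torch.distributed as dist

def init_distributed():
    # NCCL for GPU jobs (default, fastest); Gloo as the CPU fallback.
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    # torchrun provides the rendezvous env vars, so env:// init "just works".
    dist.init_process_group(backend=backend)
    if backend == "nccl":
        # Bind each process to its local GPU (LOCAL_RANK is set by torchrun).
        torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
    return dist.get_rank(), dist.get_world_size()

if __name__ == "__main__":
    rank, world_size = init_distributed()
    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Sanity check: sum a tensor of ones across all ranks.
    t = torch.ones(1, device=device)
    dist.all_reduce(t, op=dist.ReduceOp.SUM)
    print(f"rank {rank}/{world_size}: all_reduce sum = {t.item()}")
    dist.destroy_process_group()
```

Run with e.g. `torchrun --nproc_per_node=2 demo.py` (demo.py being whatever file holds the sketch).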