NCCL

Creator
Seonglae Cho
Created
2023 May 30 8:10
Edited
2024 Jul 7 2:5

NVIDIA Collective Communication Library

Windows is not supported

Cross-GPU tensor communication

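Almost every CUDA-based multi-GPU training/inference server uses NCCL. Commonly used environment variables:
  • NCCL_SOCKET_IFNAME: selects which network interface(s) NCCL uses for socket communication
  • NCCL_DEBUG: sets the log verbosity (VERSION, WARN, INFO, TRACE)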
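
As a rough sketch of how the two variables above might be set from Python before any NCCL communicator is created: the helper names `nccl_env` and `apply_nccl_env` are hypothetical, and the interface name `eth0` is only an assumption about the host.

```python
import os


def nccl_env(iface=None, debug="WARN"):
    """Build a dict of NCCL environment variables (hypothetical helper)."""
    env = {"NCCL_DEBUG": debug}
    if iface is not None:
        # Interface name is host-specific; "eth0" below is just an example.
        env["NCCL_SOCKET_IFNAME"] = iface
    return env


def apply_nccl_env(env):
    """Copy the settings into the current process environment."""
    # NCCL reads these when it initialises, so this should run before
    # the first communicator is created (e.g. before process group init).
    os.environ.update(env)


if __name__ == "__main__":
    apply_nccl_env(nccl_env(iface="eth0", debug="INFO"))
    print({k: v for k, v in os.environ.items() if k.startswith("NCCL_")})
```

Exporting the same variables in the shell that launches the job has the same effect; doing it in Python mainly helps when the launcher environment is not under your control.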

Using torch

python -c "import torch;print(torch.cuda.nccl.version())"
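
The one-liner above fails with a traceback when torch is missing or was built without NCCL. A slightly more defensive sketch is shown here; the function names are hypothetical, and it assumes `torch.cuda.nccl.version()` returns a tuple such as `(2, 18, 5)`, which is the case in recent PyTorch releases but may differ in older ones.

```python
def format_nccl_version(version):
    """Turn a version tuple like (2, 18, 5) into "2.18.5"."""
    if isinstance(version, tuple):
        return ".".join(str(part) for part in version)
    # Some older builds may report a different type; fall back to its string form.
    return str(version)


def detect_nccl_version():
    """Return the NCCL version bundled with torch, or None if unavailable."""
    try:
        import torch

        return format_nccl_version(torch.cuda.nccl.version())
    except Exception:
        # torch not installed, or built without CUDA/NCCL support.
        return None


if __name__ == "__main__":
    print(detect_nccl_version() or "NCCL not available through torch")
```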

Install

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
apt update
apt install libnccl2=2.18.5-1+cuda12.1 libnccl-dev=2.18.5-1+cuda12.1
 
 


Env

Timeout

 
 

Recommendations