NVIDIA Collective Communication Library
Windows is not supported
Cross-GPU tensor communication
Almost every CUDA-based multi-GPU training or inference server uses NCCL
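NCCL_SOCKET_IFNAME: network interface(s) NCCL uses for socket communication
NCCL_DEBUG: log verbosity (VERSION, WARN, INFO, TRACE)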
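A minimal sketch of setting these two variables from Python before launching a job. The helper name `nccl_env` and the interface name `eth0` are made up for illustration; substitute whatever interface your machine actually has.

```python
import os

# Log levels that NCCL_DEBUG accepts.
VALID_DEBUG_LEVELS = {"VERSION", "WARN", "INFO", "TRACE"}


def nccl_env(ifname, debug="WARN", base=None):
    """Return a copy of the environment with the two NCCL variables set.

    `nccl_env` is a hypothetical helper, not part of NCCL or PyTorch.
    """
    level = debug.upper()
    if level not in VALID_DEBUG_LEVELS:
        raise ValueError(f"unknown NCCL_DEBUG level: {debug!r}")
    # Copy so the caller's environment is not modified in place.
    env = dict(os.environ if base is None else base)
    env["NCCL_SOCKET_IFNAME"] = ifname
    env["NCCL_DEBUG"] = level
    return env


# "eth0" is an assumed interface name; base={} keeps the example output small.
env = nccl_env("eth0", "INFO", base={})
print(env)
```

The resulting dict can be passed as `env=` to `subprocess.run` when starting the training process, so the settings apply only to that job rather than to the whole shell.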
Using torch
python -c "import torch;print(torch.cuda.nccl.version())"
Install
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
apt update
apt install libnccl2=2.18.5-1+cuda12.1 libnccl-dev=2.18.5-1+cuda12.1
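One way to check the installed library without torch is to load it with `ctypes` and call `ncclGetVersion`. This is a sketch that assumes the package put `libnccl.so.2` on the loader path and that the CUDA runtime it depends on is also present. The helper names are hypothetical, and the decoding follows the version-code scheme in `nccl.h` as I understand it.

```python
import ctypes


def decode_nccl_version(code):
    """Split NCCL's integer version code into (major, minor, patch).

    Assumed encoding: releases up to 2.8 use major*1000 + minor*100 + patch,
    later releases use major*10000 + minor*100 + patch.
    """
    if code < 10000:
        return (code // 1000, (code % 1000) // 100, code % 100)
    return (code // 10000, (code % 10000) // 100, code % 100)


def installed_nccl_version(library="libnccl.so.2"):
    """Return the version of the system NCCL, or None if it cannot be loaded."""
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        # Library not installed, or one of its dependencies is missing.
        return None
    code = ctypes.c_int(0)
    # ncclGetVersion writes the version code into `code`; 0 means ncclSuccess.
    status = lib.ncclGetVersion(ctypes.byref(code))
    if status != 0:
        return None
    return decode_nccl_version(code.value)


print(installed_nccl_version())
```

With the packages pinned above, the expected result would be a 2.18.5 tuple. `None` means the library could not be loaded from the default search path.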