IB, NVIDIA Quantum-2 InfiniBand

Stuck Process and SIGTERM Signal Interruption During Training with accelerate launch
Updated 2024 Feb 3 16:48

NCCL Environment Variables - NCCL 2.20.3 documentation
NCCL has an extensive set of environment variables to tune for specific usage.
https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html

Switch

NVIDIA InfiniBand Switches
NVIDIA InfiniBand switches deliver high performance and port density at speeds of 40/56/100/200Gb/s for HPC, AI, Web 2.0, big data, clouds, and enterprise data centers.
https://www.nvidia.com/en-gb/networking/infiniband-switching/
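
When a training job launched with accelerate launch hangs on an InfiniBand fabric, a common first step is to turn on NCCL logging through the environment variables described in the NCCL documentation linked above. The following is a minimal sketch, not a definitive recipe: the helper name nccl_debug_env and the device name mlx5_0 are assumptions made for illustration, while NCCL_DEBUG, NCCL_DEBUG_SUBSYS, and NCCL_IB_HCA are variables documented by NCCL.

```python
import os


def nccl_debug_env(base=None, ib_hca=None):
    """Return a copy of an environment with NCCL diagnostics enabled.

    nccl_debug_env is a hypothetical helper, not part of NCCL or accelerate.
    """
    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base is None else base)
    # Log NCCL initialization and transport selection.
    env["NCCL_DEBUG"] = "INFO"
    # Limit logging to the init and network subsystems.
    env["NCCL_DEBUG_SUBSYS"] = "INIT,NET"
    if ib_hca is not None:
        # Restrict NCCL to the named InfiniBand adapter(s).
        env["NCCL_IB_HCA"] = ib_hca
    return env


if __name__ == "__main__":
    # "mlx5_0" is an assumed adapter name; check the actual name on your nodes.
    env = nccl_debug_env(base={}, ib_hca="mlx5_0")
    for key in sorted(env):
        print(f"{key}={env[key]}")
```

The returned dictionary can be passed as the env argument of subprocess.run when starting accelerate launch, so the diagnostics apply only to that run and not to the whole shell session.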