bchandarr
I am also getting the same TCP timeout error. Versions: vllm: 0.82.0, ray: 2.43.0. Setup: 2 H100 (2 GPUs). I tried `NCCL_P2P_DISABLE=1`, `NCCL_NVLS_ENABLE=0`, and `--disable-custom-all-reduce`; none of them worked.
I also tried setting `NCCL_SOCKET_IFNAME`, but that didn't work either. Running in Azure k8s.
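In case it helps anyone else checking the interface name: a minimal sketch, assuming a Linux pod, of how one might list the interfaces visible inside the container before choosing a value for `NCCL_SOCKET_IFNAME`. The helper names `candidate_interfaces` and `nccl_env` are made up for illustration, and picking the first non-loopback interface is only a guess; the right one depends on the cluster's networking. `NCCL_DEBUG=INFO` is added so NCCL logs which interface it actually selects.

```python
import os
import socket


def candidate_interfaces():
    """Return the non-loopback interface names visible in this container."""
    return [name for _, name in socket.if_nameindex() if name != "lo"]


def nccl_env(ifname):
    """Build the NCCL settings mentioned in this thread (hypothetical helper)."""
    return {
        "NCCL_SOCKET_IFNAME": ifname,  # interface NCCL should use for sockets
        "NCCL_P2P_DISABLE": "1",       # flag already tried above
        "NCCL_NVLS_ENABLE": "0",       # flag already tried above
        "NCCL_DEBUG": "INFO",          # make NCCL log its interface choice
    }


if __name__ == "__main__":
    names = candidate_interfaces()
    print("interfaces:", names)
    if names:
        # Assumption: the first non-loopback interface is the pod network.
        os.environ.update(nccl_env(names[0]))
```

These variables would presumably have to be present in every pod's environment before the Ray workers start, since setting them only in the driver process may not reach the workers.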