
NCCL debug with benchmark_cuda

Open insanum opened this issue 3 years ago • 1 comment

I'm running benchmark_cuda under MPI and setting various NCCL environment variables on the command line. When I pass -x NCCL_DEBUG=INFO, no NCCL debug output shows up on the console. Any ideas?
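One way to rule out the launcher is to check that the variable actually reaches each rank. The following is a minimal sketch, not part of gloo or NCCL: the helper name `nccl_env` is made up for illustration, and reading `OMPI_COMM_WORLD_RANK` assumes the job is launched with Open MPI.

```python
import os


def nccl_env(environ=None):
    """Return the NCCL_* variables visible to this process, sorted by name.

    `environ` defaults to os.environ; passing a dict makes the helper easy to test.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in sorted(env) if name.startswith("NCCL_")}


if __name__ == "__main__":
    # Assumption: Open MPI exports OMPI_COMM_WORLD_RANK to every rank it launches.
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")
    found = nccl_env()
    print(f"rank {rank}: {found if found else 'no NCCL_* variables set'}")
```

Saved under a hypothetical name such as check_env.py and started with the same mpirun options (including -x NCCL_DEBUG=INFO), each rank should report NCCL_DEBUG if the launcher is forwarding it. If it does, the missing output is more likely down to how the benchmark uses NCCL than to the environment.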

insanum · Feb 09 '22 18:02

Possibly answering my own question: I think I know what is going on. gloo isn't using NCCL to create the transport paths between nodes; instead it only uses the broadcast/allreduce kernel code from NCCL to run on the GPUs.

insanum · Feb 09 '22 20:02