gloo
NCCL debug with benchmark_cuda
I'm running benchmark_cuda with MPI and setting various NCCL environment variables on the command line. When I specify -x NCCL_DEBUG=INFO, I don't see any debug info printed to the console. Any ideas?
Possibly answering my own question: I think I know what is going on. gloo isn't using NCCL to create the transport paths between nodes; instead it just uses the broadcast/allreduce kernel code from NCCL to run on the GPUs, which would explain why no NCCL debug output shows up.
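To rule out the simpler explanation, that the variable never reaches the ranks at all, a small check like the one below can be launched the same way as the benchmark. This is only a sketch: the script name `check_env.py` and the helper `nccl_env` are hypothetical, and reading `OMPI_COMM_WORLD_RANK` assumes the launcher is Open MPI.

```python
import os


def nccl_env(environ=None):
    """Return the NCCL_* variables visible to this process, sorted by name."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in sorted(env) if name.startswith("NCCL_")}


if __name__ == "__main__":
    # Assumption: Open MPI exports OMPI_COMM_WORLD_RANK to each launched process.
    # Under other launchers the rank is simply reported as "?".
    rank = os.environ.get("OMPI_COMM_WORLD_RANK", "?")
    print(f"rank {rank}: {nccl_env()}")
```

Running it as, for example, `mpirun -np 2 -x NCCL_DEBUG=INFO python check_env.py` should show `NCCL_DEBUG` on every rank if the variable is being forwarded. If it is, the missing output is more likely down to how gloo uses NCCL than to the MPI launch.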