Guillaume Jeanneret

2 comments by Guillaume Jeanneret

Hi, I had the same problem while running the training code. It seems to be a deadlock in NCCL 2.7.8 [(see this issue)](https://github.com/pytorch/pytorch/issues/47885). Try running `export NCCL_P2P_DISABLE=1` before using it,...
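
As a rough illustration of the workaround in the comment above, the same variable can be set for a child process from a small Python launcher instead of the shell. This is only a sketch: `nccl_safe_env` and `run_training` are hypothetical helper names, and the command in the usage part is a placeholder, not the actual training script. `NCCL_P2P_DISABLE=1` disables NCCL's peer-to-peer transport between GPUs, so it may cost some multi-GPU throughput.

```python
import os
import subprocess
import sys


def nccl_safe_env(base_env=None):
    """Return a copy of the environment with NCCL peer-to-peer transport disabled."""
    env = dict(os.environ if base_env is None else base_env)
    # Same effect as `export NCCL_P2P_DISABLE=1` in the launching shell.
    env["NCCL_P2P_DISABLE"] = "1"
    return env


def run_training(cmd):
    """Run `cmd` (a list of arguments) with the workaround applied; return its exit code."""
    return subprocess.run(cmd, env=nccl_safe_env()).returncode


if __name__ == "__main__":
    # Placeholder command (assumption): it only prints the variable as the child sees it.
    # Replace it with the real training command.
    check = "import os; print(os.environ['NCCL_P2P_DISABLE'])"
    run_training([sys.executable, "-c", check])
```

The variable has to be in the environment before NCCL is initialized, which is why the sketch sets it for the child process rather than changing it midway through a running job.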

Hi, I would also like to know when the training code will be released. Thanks!