Simo Lin
seriously, wtf? who thought this was a good idea?
can you also output all of your NCCL env vars? and have you run the NCCL benchmark?
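fwiw, a minimal sketch of one way to dump them, assuming the variables of interest all start with `NCCL_` and are set in the environment of the process that runs it. The helper name `nccl_env` is made up for illustration and is not part of NCCL or any library:

```python
import os


def nccl_env(environ=None):
    """Return every NCCL_* variable from the given mapping, sorted by name.

    `nccl_env` is a hypothetical helper; it only filters by the NCCL_ prefix
    and does not know which variables NCCL actually reads.
    """
    env = os.environ if environ is None else environ
    return {name: env[name] for name in sorted(env) if name.startswith("NCCL_")}


if __name__ == "__main__":
    # Print in NAME=value form so the output can be pasted into a comment.
    for name, value in nccl_env().items():
        print(f"{name}={value}")
```

This only shows variables that are explicitly set; anything left at its default will not appear.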
any ETA? And would this support all families and multiple images?
/tag-and-rerun-ci
this is the same issue; we should force it to wait for the pods to be ready
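rough sketch of what "wait for ready" could look like, as a generic poll-with-timeout loop. Everything here is an assumption: `wait_until` is a hypothetical helper, and `is_ready` stands in for whatever readiness check the job actually uses (it is not shown in this thread):

```python
import time


def wait_until(is_ready, timeout_s=300.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `is_ready()` until it returns True or `timeout_s` elapses.

    Returns True if the check passed, False on timeout. `sleep` and `clock`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The caller would fail the step when this returns False instead of proceeding against pods that are not up yet.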
/tag-and-rerun-ci
would you mind fixing the lint error? thanks
/tag-and-rerun-ci