Results: 2 comments of LitingLin
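I think simply letting connections hit the wall and dealing with it afterwards is risky; collecting as complete a gfwlist as possible is the safer approach.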
We have not encountered this problem before. It looks like one node called torch.distributed.all_reduce while the other nodes never made the matching call.
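For anyone trying to picture this failure mode: a collective such as all_reduce only completes once every rank in the group has entered it. The sketch below is an analogy only. It does not use torch at all and simulates ranks with threads and a standard-library `threading.Barrier`; the helper name `run_ranks` and the status strings are made up for illustration.

```python
import threading


def run_ranks(world_size, skipping_ranks, timeout=0.5):
    """Simulate a collective call with threads (analogy, not torch code).

    Each "rank" either enters the barrier (stand-in for the collective)
    or skips it. Returns a dict mapping rank -> "ok", "skipped", or "timeout".
    """
    barrier = threading.Barrier(world_size)
    status = {}
    lock = threading.Lock()

    def worker(rank):
        if rank in skipping_ranks:
            # This rank never enters the collective.
            outcome = "skipped"
        else:
            try:
                # Blocks until all world_size ranks arrive, or the timeout expires.
                barrier.wait(timeout=timeout)
                outcome = "ok"
            except threading.BrokenBarrierError:
                # Raised when the wait times out or the barrier is broken.
                outcome = "timeout"
        with lock:
            status[rank] = outcome

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(world_size)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return status


if __name__ == "__main__":
    print(run_ranks(2, set()))  # every rank participates
    print(run_ranks(2, {1}))    # rank 1 skips, rank 0 is left waiting
```

When one rank skips, the remaining rank sits in the barrier until the timeout fires, which is roughly what a stuck all_reduce looks like from the waiting node. Checking that every rank takes the same code path up to the collective is usually the first thing to look at.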