有名氏

2 comments by 有名氏

Same error when training on multiple nodes.

Could you provide guidance on how to consolidate the weights of a module (specifically ParallelMLP and Parallel Attention) into a PyTorch-compatible format? I am using a tensor-parallel size greater than 1, which...
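
For context, here is a minimal sketch of what such a consolidation usually involves. It assumes the common tensor-parallel convention where a column-parallel linear layer splits the output dimension and a row-parallel one splits the input dimension. The helper names (`merge_column_parallel`, `merge_row_parallel`) and the layer sizes are hypothetical, and the shards are simulated with plain `torch.nn.Linear` layers rather than loaded from a real checkpoint. Fused QKV weights in attention may be interleaved per head depending on the codebase version, so plain concatenation may not be enough there; check the layout before relying on this.

```python
import torch


def merge_column_parallel(shards):
    # Column-parallel layers split the output dimension. nn.Linear stores
    # weight as [out_features, in_features], so concatenate along dim 0.
    return torch.cat(shards, dim=0)


def merge_row_parallel(shards):
    # Row-parallel layers split the input dimension, so concatenate along dim 1.
    return torch.cat(shards, dim=1)


torch.manual_seed(0)
tp, hidden, ffn = 2, 8, 32  # hypothetical sizes for illustration

# Reference (unsharded) MLP layers.
fc1 = torch.nn.Linear(hidden, ffn)   # plays the role of the column-parallel layer
fc2 = torch.nn.Linear(ffn, hidden)   # plays the role of the row-parallel layer

# Simulate what each tensor-parallel rank would hold.
fc1_w_shards = list(fc1.weight.detach().chunk(tp, dim=0))
fc1_b_shards = list(fc1.bias.detach().chunk(tp, dim=0))
fc2_w_shards = list(fc2.weight.detach().chunk(tp, dim=1))

# Consolidate the shards back into full PyTorch-style tensors.
merged_fc1_w = merge_column_parallel(fc1_w_shards)
merged_fc1_b = torch.cat(fc1_b_shards, dim=0)  # bias follows the output split
merged_fc2_w = merge_row_parallel(fc2_w_shards)
# Assumption: the row-parallel bias is replicated on every rank,
# so one copy is taken as-is instead of being concatenated.
merged_fc2_b = fc2.bias.detach()
```

The merged tensors can then be copied into ordinary `torch.nn.Linear` modules of the full size, for example with `load_state_dict`.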