有名氏
Results: 2 comments by 有名氏
Same error when training on multiple nodes
Could you provide guidance on how to consolidate the weights of a module (specifically ParallelMLP and ParallelAttention) into a PyTorch-compatible format? I am using a tensor-parallel size greater than 1, which...
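
For the consolidation question, here is a minimal sketch of one common approach. It assumes Megatron-style sharding: column-parallel linear weights are split along the output dimension (dim 0) and row-parallel linear weights along the input dimension (dim 1). The helper names `merge_column_parallel` and `merge_row_parallel` are hypothetical, and the toy tensors stand in for shards that would normally be loaded from each rank's checkpoint.

```python
import torch


def merge_column_parallel(shards):
    """Merge shards of a column-parallel weight (split along the output dim, dim 0)."""
    return torch.cat(shards, dim=0)


def merge_row_parallel(shards):
    """Merge shards of a row-parallel weight (split along the input dim, dim 1)."""
    return torch.cat(shards, dim=1)


# Toy round trip: shard a full MLP across tp=2 ranks, then merge the shards back.
torch.manual_seed(0)
hidden, ffn, tp = 4, 16, 2
w_up = torch.randn(ffn, hidden)    # h -> 4h projection, assumed column-parallel
w_down = torch.randn(hidden, ffn)  # 4h -> h projection, assumed row-parallel

up_shards = list(torch.chunk(w_up, tp, dim=0))      # what each rank would hold
down_shards = list(torch.chunk(w_down, tp, dim=1))

merged_up = merge_column_parallel(up_shards)
merged_down = merge_row_parallel(down_shards)
```

The merged tensors can then be copied into ordinary `torch.nn.Linear` layers of matching shape. Two caveats under the same assumptions: the bias of a row-parallel layer is usually not sharded, so one copy is enough, and a fused query/key/value projection may store its rows grouped per head, in which case a plain concatenation may need reordering before it matches a standard attention layout.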