有名氏 comments

Results: 2 comments by 有名氏

nvmlDeviceGetHandleByPciBusId() failed with error #2

I get the same error when training on multiple nodes.

Huggingface <-> Megatron-LM Compatibility

Could you provide guidance on how to consolidate the weights of a module (specifically ParallelMLP and ParallelAttention) into a PyTorch-compatible format? I am using a tensor-parallel size greater than 1, which...