tongye98 issues

Results: 2 issues of tongye98

Multi-GPU training.

**Describe the bug** Not a bug, but a suggestion for enhancement. The current multi-GPU training solution in joeynmt-2.0 uses `nn.DataParallel`. But the flaw of `nn.DataParallel` is obvious and PyTorch...

enhancement

help wanted

LoRA fine-tuning of the DeepSeek-Coder-V2-Lite model: low GPU utilization

When fine-tuning the DeepSeek-Coder-V2-Lite-Base model with LoRA, GPU utilization stays at only around 40%. Is this because of the MoE architecture?