liangshaopeng
Results: 2 issues of liangshaopeng
helper.cpp
**Is your feature request related to a problem? Please describe.** I have seen support for training MoE models in Megatron, including scripts for the Mixtral 8x7B model, at https://docs.nvidia.com/megatron-core/developer-guide/latest/api-guide/moe.html. However,...
stale