Examples integrated with Megatron-LM

Open · learner321 opened this issue 6 months ago · 1 comment

Could you provide an example integrated with Megatron-LM? Thanks :)

learner321 · Jun 20 '25 11:06

Hello. Megatron-LM already includes a non-dynamic component that supports several MoE features. However, Megatron's expert parameter placement is static and is coupled with a set of Megatron's predefined static parallelism configurations. Adding another MoE implementation to Megatron would therefore break those settings and conflict with the parameter placement that Tutel requires (e.g. for switching Top-k / parallelism from time to time).

ghostplant · Jun 22 '25 00:06