[Feature] Support New Arguments for Expert Routing Policies.

Open jacklanda opened this issue 8 months ago • 9 comments

Hi there, thanks for mergoo, an amazing codebase for MoE model construction.

One crucial feature that I think is still missing: mergoo should let the user select the basic routing policy when constructing the MoE layer.

Specifically, I think the forward method shown here should be refactored so that it accepts the routing policy as an argument passed by the user. As far as I know, the current code constructs a fully activated MoE model, in which every expert runs for every token, rather than a truly sparse MoE model.
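
To make the request concrete, a minimal sketch of such a policy switch might look like the following. It is plain, framework-free Python rather than mergoo's actual PyTorch code, and the names `route`, `moe_forward`, `policy`, and `top_k` are hypothetical, not existing mergoo arguments. Renormalizing the kept weights is one common choice, assumed here for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits, policy="dense", top_k=2):
    """Return one weight per expert for a single token.

    policy="dense": every expert gets a non-zero weight (fully activated).
    policy="top_k": only the k highest-scoring experts are kept and their
    weights are renormalized; all other experts get weight 0.
    Both policy names are assumptions made for this sketch.
    """
    probs = softmax(gate_logits)
    if policy == "dense":
        return probs
    if policy == "top_k":
        if not 1 <= top_k <= len(probs):
            raise ValueError("top_k must be between 1 and the number of experts")
        order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
        keep = set(order[:top_k])
        kept_mass = sum(probs[i] for i in keep)
        return [probs[i] / kept_mass if i in keep else 0.0
                for i in range(len(probs))]
    raise ValueError(f"unknown routing policy: {policy!r}")


def moe_forward(x, experts, gate_logits, policy="dense", top_k=2):
    """Combine expert outputs for a scalar input.

    Experts with zero weight are never called, which is what makes the
    top_k policy sparse in practice.
    """
    weights = route(gate_logits, policy=policy, top_k=top_k)
    return sum(w * expert(x) for w, expert in zip(weights, experts) if w > 0.0)
```

With `policy="dense"` every expert contributes to the output, which matches the fully activated behaviour described above; with `policy="top_k"` only `top_k` experts are evaluated per token.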

I would be happy to share my code for this feature and open a PR for it 🤗.

Do you have any thoughts on this?

jacklanda, May 31 '24 10:05