
Support for fine-grained experts in MoE models

[Open] misdelivery opened this issue 7 months ago · 0 comments

Are there any plans to support fine-grained experts in the future?

Fine-grained expert segmentation is a technique adopted in projects such as Qwen MoE and DeepSeek MoE, and it has shown promising results. The approach partitions a single FFN into several smaller segments, each serving as an expert, which allows a larger number of experts without increasing the overall parameter count.
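To illustrate what I mean, here is a minimal sketch (not mergekit's API; the function name and weight layout are my own assumptions) of how a dense SwiGLU-style FFN, with weights shaped as in typical Llama/Qwen Hugging Face checkpoints, could be sliced along its intermediate dimension into several smaller expert FFNs:

```python
# Hypothetical sketch: partition one dense SwiGLU FFN into N fine-grained experts
# by slicing the intermediate dimension. Not part of mergekit.
import torch

def split_ffn_into_experts(gate_proj: torch.Tensor,
                           up_proj: torch.Tensor,
                           down_proj: torch.Tensor,
                           num_experts: int):
    """Slice dense FFN weights into `num_experts` smaller expert FFNs.

    Assumed shapes (Llama/Qwen-style):
      gate_proj, up_proj: (intermediate_size, hidden_size)
      down_proj:          (hidden_size, intermediate_size)
    """
    intermediate_size = gate_proj.shape[0]
    assert intermediate_size % num_experts == 0, "intermediate dim must divide evenly"
    seg = intermediate_size // num_experts

    experts = []
    for i in range(num_experts):
        rows = slice(i * seg, (i + 1) * seg)
        experts.append({
            "gate_proj": gate_proj[rows, :].clone(),  # (seg, hidden)
            "up_proj":   up_proj[rows, :].clone(),    # (seg, hidden)
            "down_proj": down_proj[:, rows].clone(),  # (hidden, seg)
        })
    return experts

# Example: split an 11008-wide FFN into 4 fine-grained experts of width 2752.
hidden, inter = 4096, 11008
experts = split_ffn_into_experts(
    torch.randn(inter, hidden),
    torch.randn(inter, hidden),
    torch.randn(hidden, inter),
    num_experts=4,
)
print(len(experts), experts[0]["gate_proj"].shape)  # 4 torch.Size([2752, 4096])
```

Because the SwiGLU output decomposes additively over rows of the intermediate dimension, summing the outputs of all segments reproduces the original dense FFN exactly; the point of fine-grained experts is that a router then activates only a subset of these segments per token, so the expert count grows while the total parameter count stays the same.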

misdelivery · Jul 05 '24 21:07