DeepSpeed
[REQUEST] Mixture of Experts (MoE) support for segmentation tasks
Feature relates to MoE end-to-end inference. I would like to know whether the MoE implementation in DeepSpeed can be used for a segmentation task.
Describe the solution: I have been working on an MoE problem. I read the DeepSpeed-MoE paper and the documentation, and as far as I can tell MoE only works with an MLP expert and a linear gate network.
Feature: make MoE in DeepSpeed support segmentation.
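To make the request concrete, here is a minimal sketch of one way an MLP-expert, linear-gate MoE could be applied to dense prediction: treat every pixel of a feature map as a token, route it with top-1 gating, and reshape the result back to a label map. This is plain NumPy and does not use DeepSpeed; `moe_segment`, the weight shapes, and the random inputs are all hypothetical names and values chosen for illustration, not anything taken from the DeepSpeed codebase or paper.

```python
import numpy as np


def moe_segment(features, gate_w, experts):
    """Top-1 gated MoE applied per pixel (illustrative only, not DeepSpeed code).

    features: (C, H, W) feature map from some backbone (assumed).
    gate_w:   (C, E) weights of a linear gate over E experts.
    experts:  list of E tuples (w1, w2); w1 is (C, hidden), w2 is (hidden, classes).
    Returns a (H, W) label map and a (H, W) map of the expert chosen per pixel.
    """
    c, h, w = features.shape
    # Flatten spatial dims so each pixel becomes one token of size C.
    tokens = features.reshape(c, h * w).T              # (H*W, C)

    # Linear gate followed by a numerically stable softmax.
    gate_logits = tokens @ gate_w                      # (H*W, E)
    gate_logits = gate_logits - gate_logits.max(axis=1, keepdims=True)
    probs = np.exp(gate_logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    choice = probs.argmax(axis=1)                      # top-1 expert per pixel

    num_classes = experts[0][1].shape[1]
    logits = np.zeros((h * w, num_classes))
    for e, (w1, w2) in enumerate(experts):
        mask = choice == e
        if not mask.any():
            continue                                   # no pixel routed here
        hidden = np.maximum(tokens[mask] @ w1, 0.0)    # two-layer MLP with ReLU
        # Scale the expert output by its gate probability.
        logits[mask] = probs[mask, e][:, None] * (hidden @ w2)

    return logits.argmax(axis=1).reshape(h, w), choice.reshape(h, w)


# Hypothetical sizes and random weights, only to exercise the shapes.
rng = np.random.default_rng(0)
C, H, W, E, HID, K = 8, 4, 5, 3, 16, 2
features = rng.normal(size=(C, H, W))
gate_w = rng.normal(size=(C, E))
experts = [(rng.normal(size=(C, HID)), rng.normal(size=(HID, K))) for _ in range(E)]

labels, routing = moe_segment(features, gate_w, experts)
print(labels.shape, routing.shape)
```

The open question for this request is whether the existing DeepSpeed MoE layer can be dropped into a segmentation model in this per-pixel fashion, or whether convolutional experts would need dedicated support.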