wangyao

Results: 1 issue by wangyao

Hi, I'm trying to understand how the Gate module works in Tutel's MoE implementation. Since each rank only maintains a subset of experts (num_experts_per_device), but the Gate output seems to...
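
For readers following the question, here is a minimal sketch of one common expert-parallel arrangement. It is not confirmed here as Tutel's actual behavior. The sketch assumes that the gate scores every expert in the whole job (world size times num_experts_per_device) and that each rank owns a contiguous block of those experts. The helper names owner_of and top1_route, and the contiguous layout itself, are hypothetical and used only for illustration.

```python
from typing import List, Tuple


def owner_of(global_expert: int, num_experts_per_device: int) -> Tuple[int, int]:
    """Map a global expert index to (rank, local index).

    Assumption: rank r owns global experts
    [r * num_experts_per_device, (r + 1) * num_experts_per_device).
    """
    return divmod(global_expert, num_experts_per_device)


def top1_route(scores: List[float], num_experts_per_device: int) -> Tuple[int, int, int]:
    """Pick the highest-scoring global expert and report where it lives.

    Returns (global expert index, owning rank, local index on that rank).
    """
    best = max(range(len(scores)), key=scores.__getitem__)
    rank, local = owner_of(best, num_experts_per_device)
    return best, rank, local


world_size = 4                   # hypothetical number of ranks
num_experts_per_device = 2       # experts held by each rank
num_global_experts = world_size * num_experts_per_device

# Hypothetical gate scores for one token: one score per global expert.
token_scores = [0.1, 0.3, 0.05, 0.2, 0.9, 0.0, 0.4, 0.15]
assert len(token_scores) == num_global_experts

print(top1_route(token_scores, num_experts_per_device))  # (4, 2, 0)
```

Under these assumptions, the gate's output width depends on the global expert count rather than the per-rank count, and a token whose best expert lives on another rank has to be sent there, typically with an all-to-all exchange.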