chenxiaobing
Results: 2 issues of chenxiaobing
Fix the logic in grouped topk computation of fused MoE. There is a slight difference between vLLM's grouped_topk and the official code. When the newly introduced bias term (e_score_correction_bias in vLLM)...
## Motivation In the implementation of grouped topk in the MoE layer, the scores of masked groups are set to 0, which may lead to selecting incorrect experts in certain scenarios....
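
The masking problem described in the snippet above can be illustrated with a toy sketch. This is not the vLLM implementation: the helper `grouped_topk`, its parameters, and the sample scores are hypothetical, and the failing scenario shown (all remaining scores being negative, for instance after a bias is added) is an assumption about one case where masking with 0 could go wrong.

```python
import math

def grouped_topk(scores, num_groups, topk_group, topk, mask_value):
    """Toy grouped top-k over a flat list of expert scores (hypothetical helper)."""
    group_size = len(scores) // num_groups
    groups = [scores[g * group_size:(g + 1) * group_size] for g in range(num_groups)]
    # Rank groups by their best expert score and keep the top `topk_group` groups.
    group_best = [max(g) for g in groups]
    order = sorted(range(num_groups), key=lambda g: group_best[g], reverse=True)
    kept = set(order[:topk_group])
    # Replace the scores of experts in dropped groups with `mask_value`.
    masked = [s if (i // group_size) in kept else mask_value
              for i, s in enumerate(scores)]
    # Select the top-k experts from the masked scores.
    return sorted(range(len(masked)), key=lambda i: masked[i], reverse=True)[:topk]

# Assumed scenario: every score is negative, so a mask value of 0 outranks
# the experts of the group that was actually selected.
scores = [-0.1, -0.3, -0.5, -0.9]  # two groups of two experts each

picked_with_zero = grouped_topk(scores, num_groups=2, topk_group=1, topk=2, mask_value=0.0)
picked_with_neg_inf = grouped_topk(scores, num_groups=2, topk_group=1, topk=2, mask_value=-math.inf)

print(sorted(picked_with_zero))     # experts 2 and 3, which belong to the masked group
print(sorted(picked_with_neg_inf))  # experts 0 and 1, which belong to the kept group
```

In this sketch, masking with negative infinity keeps the final selection inside the chosen group regardless of the sign of the scores; whether the actual fix takes this form is not stated in the truncated snippet.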