[BugFix] Illegal memory access for MoE On H20
When we attempted to deploy DeepSeek R1 671B on two 8-card H20 machines, vLLM crashed with an illegal memory access whenever the prompt length exceeded 32K. This PR fixes the bug.
SGLang's implementation is similar to vLLM's, so I made the same change here as well.
cc @zhyncs
@Abatom Thanks for your PR! I tried the fix on the case from https://github.com/sgl-project/sglang/issues/3333. Unfortunately, the memory access fault persists. Could you please confirm on your side as well? Thanks.
#3679 made the same fix 😂
@merrymercy Hi, the same change ([BugFix] Illegal memory access for MoE On H20 #13693) has already been merged in vLLM.
It has been merged. I added a co-author for you. Thank you.
ref https://github.com/sgl-project/sglang/pull/3679