Baizhou Zhang

Results 79 comments of Baizhou Zhang

cc @fzyzcjy @kaixih @fy1214, please have a look.

Thanks, this seems to be a good idea!

> > Hello [@FrankLeeeee](https://github.com/FrankLeeeee), would you please take a look at PR [#3680](https://github.com/sgl-project/sglang/pull/3680)? Appreciate that. Additionally, we have one concern: As we previously ran DeepSeek-R1 on SGLang and confirmed...

@teadross Can you pull the latest main branch and try again? According to #3836, this bug seems to be fixed.

@nvcastet I tried tp4 + allreduce fusion + symm memory on Dpsk-fp4, and it was compatible. Is there a specific condition that triggers this incompatibility?

@nvcastet Sure, can you open a PR that changes the server args to trtllm allreduce fusion?

Thanks @YAMY1234~ If your PR is blocked on the FlashMLA side, you can create a new branch at https://github.com/sgl-project/FlashMLA. The FlashMLA kernels now integrated in sglang are built from this repo.

@YAMY1234 Can you add a benchmark for bs=1? Pure TP is expected to be faster than DP+TP.

> > @YAMY1234 Can you add a benchmark for bs=1? Expectedly pure TP should be faster than DP+TP
> >
> > @Fridge003 Added in the PR description~

Oh I mean performance...