Huapeng Zhou
I will take care of that together with this!
Fixed by PR: https://github.com/sgl-project/sglang/pull/4718
> Hi @quinnrong94, can you take a look at this CI failure? https://github.com/sgl-project/sglang/actions/runs/14996032913/job/42130798605?pr=6109

Hi @Fridge003, I saw that the flashMLA test failed in CI. I wonder if it's due to...
> Also please provide the performance benchmark after this enhancement

Yes, there is another guy who is testing the performance!
Still working bro
> > > Also please provide the performance benchmark after this enhancement
> > >
> > > Yes, there is another guy who is testing the...
Here is my benchmark (tested on H100):

command: python3 -m sglang.bench_one_batch --model-path meta-llama/Llama-3.1-8B-Instruct --attention-backend fa3 --batch 16 --input-len 1024 --output-len 10

Before this PR:

After:

Thanks @Fridge003 for helping!
Hi @zhyncs, could you help review this PR? I think it is ready to merge.
> In the meantime, I think according to our current docs:
>
> ```
> pip install --upgrade pip
> pip install sgl-kernel --force-reinstall --no-deps
> pip install "sglang[all]>=0.4.3.post2"...
> ```
I will help look into this.
Still a work in progress :(