PGFLMG

61 comments by PGFLMG

Some 8 * H20 accuracy results for deepseek-v3, cc: @zhyncs

## Server

```bash
python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --mem-fraction-static 0.9
```

## gsm8k

```bash
python3 benchmark/gsm8k/bench_sglang.py --num-shots 8...
```

Test EP8 DeepSeek-V3 accuracy, cc: @zhyncs @sleepcoo, for this PR: https://github.com/sgl-project/sglang/pull/3602

## Device

8 * H200

## Server

```bash
python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --enable-dp-attention...
```

I think it is not supported yet. Maybe you can refer to: https://docs.sglang.ai/references/supported_models.html#how-to-support-a-new-vlm

@shimizust We are currently in the process of preparing the library for a public release, and publishing it to PyPI is a top priority for us. You can expect to...

> @FlamingoPg Seems triton is failing with this error https://github.com/sgl-project/sglang/actions/runs/19317390340/job/55254572458?pr=12969
>
> We have met similar situations before. It's caused by https://github.com/triton-lang/triton/pull/8536/files, and we solved it by pinning the version...

Great work! May I ask how long a single tuning run takes now? Is there a switch to control whether the kernel is tuned?

> @FlamingoPg Could you review this PR? Thanks!

Looks fine, let's wait for the CI.

I will help you rerun the failed jobs.

> Hi @FlamingoPg, It seems that CI failures are not related to my PR. Could you help confirm? Thanks!

Sure

> @FlamingoPg Could you check the remaining failures? thanks!

Sure