PGFLMG
Some 8 * H20 accuracy numbers for DeepSeek-V3, cc: @zhyncs

## Server

```bash
python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --mem-fraction-static 0.9
```

## gsm8k

```bash
python3 benchmark/gsm8k/bench_sglang.py --num-shots 8...
```
Test EP8 DeepSeek-V3 accuracy for this PR: https://github.com/sgl-project/sglang/pull/3602, cc: @zhyncs @sleepcoo

## Device

8 * H200

## Server

```bash
python3 -m sglang.launch_server --model deepseek-ai/DeepSeek-V3 --tp 8 --trust-remote-code --enable-dp-attention...
```
I think it's not supported yet. Maybe you can refer to: https://docs.sglang.ai/references/supported_models.html#how-to-support-a-new-vlm
@shimizust We are currently in the process of preparing the library for a public release, and publishing it to PyPI is a top priority for us. You can expect to...
> @FlamingoPg Seems triton is failing with this error https://github.com/sgl-project/sglang/actions/runs/19317390340/job/55254572458?pr=12969
>
> We have met similar situations before. It's caused by https://github.com/triton-lang/triton/pull/8536/files, and we solved it by pinning the version...
Great work! May I ask how long a single tuning run takes now? Is there a switch to control whether the kernel is tuned?
> @FlamingoPg Could you review this PR? Thanks!

Looks fine, let's wait for the CI.
I will help you rerun the failed jobs.
> Hi @FlamingoPg, It seems that CI failures are not related to my PR. Could you help confirm? Thanks!

Sure
> @FlamingoPg Could you check the remaining failures? thanks!

Sure