Li Hui
> @lambert0312 let me double check on both platforms today. What chips did you use?

A800, thanks @yiakwy-xpu-ml-framework-team
> @zhaochenyang20 I have reverted the commit to [6b08bf5](https://github.com/sgl-project/sglang/commit/6b08bf538bf3a7c69b710dc2c6f160d3f129008d). Once review is done, we could rebase onto the main branch to resolve conflicts.
>
> Please let me do rebase merge later...
@yiakwy-xpu-ml-framework-team Thanks for the reply. I will try it again according to the steps tomorrow. Logically speaking, it will be built using your patch.
I start the service using the following command:

```
python3 -m sglang.launch_server --model-path /path/to/Qwen2.5-Coder-7B-Instruct --context-length 16384 --tp 1 --speculative-algorithm EAGLE --speculative-draft-model-path /path/to/EAGLE-Qwen2-7B-Instruct --mem-fraction-static 0.5 --cuda-graph-max-bs 8 --speculative-num-steps 5 --speculative-eagle-topk 8...
```
After testing, the error is as follows:

```
Scheduler hit an exception: Traceback (most recent call last):
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/cuda_graph_runner.py", line 314, in __init__
    self.capture()
  File "/sgl-workspace/sglang/python/sglang/srt/model_executor/cuda_graph_runner.py", line 405, in capture...
```
> [lambert0312](https://github.com/lambert0312) The latest commit ([5256543](https://github.com/sgl-project/sglang/pull/6081/commits/5256543d05493646d4faaa73606b1e9498ab6e2c)) has fixed this bug. Thanks!

@u4lr451 Great, it has been verified to work properly, but the speed is much slower than when dp-attention is...
@HandH1998 @laixinn Is torch-compile not supported? When I enable torch-compile, the returned result is garbled characters, like this:

```
{"id":"2fe19ce57cdb4613bf5e1b718d21ae8b","object":"chat.completion","created":1740622831,"model":"ds3","choices":[{"index":0,"message":{"role":"assistant","content":"�-se-se goodπππ goodπ good goodππ goodπ goodπ goodππ good good goodπ good-seππ...
```
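For anyone comparing runs with and without torch-compile, a rough heuristic like the one below can flag this kind of output automatically. It is only a sketch: the function name `looks_garbled` and the `max_repeat_ratio` threshold are my own assumptions and not part of sglang. It relies only on the `choices[0].message.content` shape visible in the response above.

```python
import json


def looks_garbled(response_text: str, max_repeat_ratio: float = 0.5) -> bool:
    """Heuristically flag a chat.completion response as garbled.

    Hypothetical helper for debugging; not an sglang API.
    """
    body = json.loads(response_text)
    content = body["choices"][0]["message"]["content"]

    # U+FFFD (the replacement character) usually means invalid bytes were decoded.
    if "\ufffd" in content:
        return True

    # Degenerate output tends to repeat one whitespace-separated token many times.
    tokens = content.split()
    if not tokens:
        return False
    most_common = max(tokens.count(t) for t in set(tokens))
    return most_common / len(tokens) > max_repeat_ratio
```

Running it over the same prompt with and without the flag gives a quick yes/no signal before digging into the kernels. The threshold is arbitrary and may need tuning for short answers.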
> @lambert0312 please provide the detailed configuration for this result and try launching without torch-compile to ensure everything else is good.

@laixinn I start the 2 nodes using the following command:...
> @lambert0312 Could you test the command again without using xgrammar and --disable-overlap? I ran the command without these options and it worked for me.

@laixinn I tried again and...
Any progress on this?