newer torchao breaks sglang?
From @merrymercy: https://github.com/sgl-project/sglang/actions/runs/15090376354/job/42418045676#step:4:831
This might be breaking in both torchao 0.10.0 and 0.11.0.
Do you have more context, @merrymercy? I just ran sglang with:
drisspg/stack/59 *1 ❯ python3 -m sglang.launch_server \
--model-path meta-llama/Meta-Llama-3.1-8B-Instruct \
--torchao-config int4wo-128 \
--port 40000 --host 0.0.0.0
And I am getting:
{"id":"61e6abbcab994069aa4fd401019b9152","object":"chat.completion","created":1748578410,"model":"qwen/qwen2.5-0.5b-instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The capital of France is Paris.","reasoning_content":null,"tool_calls":null},"logprobs":null,"finish_reason":"stop","matched_stop":128009}],"usage":{"prompt_tokens":42,"total_tokens":50,"completion_tokens":8,"prompt_tokens_details":null}}
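For anyone who wants to repeat this check against a locally launched server, here is a minimal sketch of a smoke test. It assumes the server exposes an OpenAI-compatible `/v1/chat/completions` route on the port used above; the helper names (`build_chat_request`, `extract_reply`, `smoke_test`) and the default host are hypothetical and are not part of sglang or torchao. The prompt you pass is up to you, since the one used for the response above is not shown.

```python
import json
import urllib.request


def build_chat_request(prompt, model, host="127.0.0.1", port=40000):
    """Build a POST request for an OpenAI-compatible chat endpoint.

    host/port defaults are assumptions matching the launch command above.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,
    }
    return urllib.request.Request(
        f"http://{host}:{port}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(body):
    """Return (assistant text, finish_reason) from a chat.completion body."""
    choice = json.loads(body)["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


def smoke_test(prompt, model, **kwargs):
    """Send one prompt to a running server and return the parsed reply."""
    request = build_chat_request(prompt, model, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_reply(response.read())
```

With the server running, something like `smoke_test("What is the capital of France?", "meta-llama/Meta-Llama-3.1-8B-Instruct")` should return the assistant text and finish reason. Garbled text or an HTTP error here would be a quick signal that the quantized path is broken.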
Can you point to a specific test?
From the looks of it, though, this will likely solve it: https://github.com/pytorch/ao/pull/2277