[Bug]: Qwen3's answer was wrongly placed in `reasoning_content`
Your current environment
vLLM version: 0.8.5
🐛 Describe the bug
The command to serve Qwen3-32B:
VLLM_USE_V1=0 vllm serve Qwen/Qwen3-32B --served-model-name qwen3-32b -tp 4 --trust-remote-code --enable-reasoning --reasoning-parser deepseek_r1
I set VLLM_USE_V1=0 here because I need guided output.
The query command:
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-32b",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "你好"}
    ],
    "chat_template_kwargs": {"enable_thinking": false}
  }'
Here, I pass "enable_thinking": false to disable thinking for this request only. The same request can also be sent with the openai Python client, as shown below.
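For reproduction, here is a minimal sketch of the equivalent request via the openai Python client, assuming the vllm serve command above is running on localhost:8000. vLLM forwards extra_body["chat_template_kwargs"] to the chat template:

```python
# Sketch: same request via the OpenAI-compatible API.
# Assumes the vllm serve command above is running on localhost:8000.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="qwen3-32b",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "你好"},
    ],
    # chat_template_kwargs is forwarded to the chat template by vLLM;
    # enable_thinking=False should disable the <think> block per request.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)

message = response.choices[0].message
# Expected: the reply in content. Actual: it shows up in reasoning_content.
print("content:", message.content)
print("reasoning_content:", getattr(message, "reasoning_content", None))
```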
And I got the following response (the reply translates to "Hello! How can I help you?"):
...
"choices": [
    {
        "finish_reason": "stop",
        "index": 0,
        "message": {
            "content": null,
            "role": "assistant",
            "tool_calls": null,
            "function_call": null,
            "refusal": null,
            "reasoning_content": "你好!有什么我可以帮你的吗?"
        }
    }
],
...
I think the reply should be in the `content` field instead of the `reasoning_content` field.
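My guess at the failure mode, as a minimal sketch (this is not vLLM's actual parser code): a DeepSeek-R1-style reasoning parser assumes the opening <think> tag is omitted and treats everything up to </think> as reasoning. With "enable_thinking": false, Qwen3 emits no think tags at all, so the entire answer falls into the reasoning branch:

```python
# Minimal sketch of the suspected failure mode; NOT vLLM's code.
# A DeepSeek-R1-style parser treats everything before "</think>" as
# reasoning, because R1 omits the opening <think> tag.
def split_reasoning(text: str) -> tuple[str | None, str | None]:
    reasoning, sep, content = text.partition("</think>")
    if sep:  # closing tag found: text before it is reasoning
        return reasoning.strip() or None, content.strip() or None
    # No </think> at all (e.g. Qwen3 with enable_thinking=false):
    # the whole reply is misclassified as reasoning.
    return text.strip() or None, None


print(split_reasoning("你好!有什么我可以帮你的吗?"))
# ('你好!有什么我可以帮你的吗?', None) -- matches the buggy response above
```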
Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.