When calling Qwen3 through vLLM, add the enable_thinking field
Self Checks
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
1. Is this request related to a challenge you're experiencing? Tell me about your story.
When adding a model served by vLLM, if it belongs to the Qwen3 series, it should be possible to set enable_thinking.
2. Additional context or comments
No response
3. Can you help us with this feature?
- [ ] I am interested in contributing to this feature.
Adding "/no_think" to the system prompt disables the thinking process.
But it will still return the <think> tag.
@crazywoola Would you consider adding a "DeepThink" button to the page?
I've tried it and it works without problems. You need to put "/no_think" on the last line of the system prompt, with a blank line before it.
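The placement described above can be sketched as a small helper. This is only an illustration: `with_no_think` is a hypothetical name, and the system prompt text is a placeholder.

```python
def with_no_think(system_prompt: str) -> str:
    """Put /no_think on the last line, preceded by one blank line."""
    base = system_prompt.rstrip()
    if base.endswith("/no_think"):
        return base  # already present, leave the prompt alone
    if not base:
        return "/no_think"
    return f"{base}\n\n/no_think"


# Hypothetical messages list for an OpenAI-compatible chat request.
messages = [
    {"role": "system", "content": with_no_think("You are a helpful assistant.")},
    {"role": "user", "content": "What are the famous foods in Hangzhou"},
]
```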
Adding /no_think indeed prevents the generation of thought content, but the think tag is still retained. At the moment, I have to add a code block to filter out the string, which just feels a bit inelegant. Of course, it's not that important.
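For reference, a filter of the kind mentioned above might look like the sketch below. It is not the commenter's actual code; `strip_think` is a hypothetical helper that assumes the complete response text is available and that the tags appear literally as `<think>` and `</think>`.

```python
import re

# A <think>...</think> block (possibly empty) plus any whitespace after it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove think blocks from a complete (non-streamed) response."""
    return _THINK_BLOCK.sub("", text).lstrip()
```

In a Dify workflow this could sit in a code node placed after the LLM node, which is presumably what "add a code block" refers to.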
Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, it no longer generates <think> tags.
When I use Dify (version 0.15.3), setting "chat_template_kwargs" through the API does not take effect. The response still has the <think> tag.
Here is my example, which effectively removes the think tag:
curl http://ip:16434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-32b",
    "messages": [
      {"role": "user", "content": "What are the famous foods in Hangzhou"}
    ],
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_tokens": 8192,
    "presence_penalty": 1.5,
    "chat_template_kwargs": {"enable_thinking": false}
  }'
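The same request can be built from Python with only the standard library. This is a rough equivalent of the curl call, not a tested client: `build_request` is a hypothetical helper, and the host `ip:16434` and model name are copied from the example as placeholders.

```python
import json
import urllib.request


def build_request(base_url, model, prompt, enable_thinking=False):
    """Build a chat completion request that passes enable_thinking to the chat template."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
        "max_tokens": 8192,
        "presence_penalty": 1.5,
        # Forwarded by the server to the model's chat template.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request("http://ip:16434", "qwen3-32b",
                    "What are the famous foods in Hangzhou")
# To send it against a reachable server:
#     with urllib.request.urlopen(req) as resp:
#         reply = json.load(resp)
```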
Thanks. When I call the model's API directly, adding this parameter prevents the <think> tag. But when I go through Dify's API, it does not take effect.
You should use the qwen3 reasoning parser, rather than the deepseek_r1 reasoning parser, when starting vLLM:
--reasoning-parser qwen3
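When a reasoning parser is enabled, the vLLM server is expected to return the reasoning text in a separate message field instead of inline tags. The sketch below assumes that field is named `reasoning_content`, which may differ between vLLM versions; `split_reply` and the sample response are hypothetical.

```python
def split_reply(response: dict):
    """Return (reasoning, answer) from a chat completion response dict."""
    message = response["choices"][0]["message"]
    # Assumed field name; it may be absent or null when thinking is disabled.
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer


# Hypothetical response shape, for illustration only.
sample = {
    "choices": [
        {"message": {"role": "assistant",
                     "reasoning_content": None,
                     "content": "Example answer text."}}
    ]
}
reasoning, answer = split_reply(sample)
```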
I ran into this problem too. The method from the expert above solved it: --reasoning-parser qwen3
Could you share the code? Can the tag be replaced within a streaming response?
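One possible way to do this in a streaming response is a small stateful filter that holds back text that might be the start of a tag until the next chunk arrives. This is a sketch under assumptions, not code from this thread: `ThinkTagFilter` is a hypothetical class, the tags are assumed to be the literal strings `<think>` and `</think>`, and dropping the whitespace right after the closing tag is a design choice of this example.

```python
class ThinkTagFilter:
    """Drop <think>...</think> blocks from text that arrives in chunks."""

    OPEN, CLOSE = "<think>", "</think>"

    def __init__(self):
        self._buf = ""        # text not yet emitted or discarded
        self._inside = False  # True while inside a think block
        self._trim = False    # drop whitespace directly after </think>

    @staticmethod
    def _held_back(text, tag):
        """Length of the longest suffix of text that is a proper prefix of tag."""
        for n in range(min(len(tag) - 1, len(text)), 0, -1):
            if text.endswith(tag[:n]):
                return n
        return 0

    def _emit(self, text):
        if self._trim:
            text = text.lstrip()
            if text:
                self._trim = False
        return text

    def feed(self, chunk):
        """Add one chunk and return the text that is safe to show now."""
        self._buf += chunk
        out = []
        while True:
            tag = self.CLOSE if self._inside else self.OPEN
            idx = self._buf.find(tag)
            if idx >= 0:
                if not self._inside:
                    out.append(self._emit(self._buf[:idx]))
                self._buf = self._buf[idx + len(tag):]
                self._trim = self._inside  # True right after a closing tag
                self._inside = not self._inside
                continue
            # No full tag yet: keep a possible partial tag for the next chunk.
            keep = self._held_back(self._buf, tag)
            visible = self._buf[:len(self._buf) - keep]
            if not self._inside:
                out.append(self._emit(visible))
            self._buf = self._buf[len(self._buf) - keep:]
            break
        return "".join(out)

    def flush(self):
        """Call once at end of stream to release any held-back text."""
        rest, self._buf = self._buf, ""
        return "" if self._inside else self._emit(rest)


def filter_stream(chunks):
    f = ThinkTagFilter()
    return "".join(f.feed(c) for c in chunks) + f.flush()
```

Each streamed delta would be passed through `feed`, with `flush` called once after the final chunk.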