dify: When calling Qwen3 using vLLM, add the enable_thinking field

Self Checks

[x] I have searched for existing issues, including closed ones.
[x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[x] Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

When adding a model served by vLLM, if it belongs to the Qwen3 series, it should be possible to set the enable_thinking parameter.

2. Additional context or comments

No response

3. Can you help us with this feature?

[ ] I am interested in contributing to this feature.

Apr 29 '25 03:04 ZxnSnowy

Adding "/no_think" to the system prompt disables the thinking process.

Apr 29 '25 07:04 yankai-victor

Adding "/no_think" to the system prompt disables the thinking process.

But it will still return the <think> tag.

Apr 29 '25 08:04 qqzp63168

@crazywoola Would you consider adding a "DeepThink" button to the page?

May 01 '25 03:05 aki-1995

Adding "/no_think" to the system prompt disables the thinking process.

But it will still return the <think> tag.

I've tried it and it works without problems. You need to put "/no_think" on the last line of the system prompt, with a blank line before it.

May 17 '25 02:05 SimonHu1993

Adding "/no_think" to the system prompt disables the thinking process.

But it will still return the <think> tag.

I've tried it and it works without problems. You need to put "/no_think" on the last line of the system prompt, with a blank line before it.

Adding /no_think does prevent the thinking content from being generated, but the <think> tag is still returned. For now I have to add a code block to filter out the string, which feels a bit inelegant. Of course, it's not that important.
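
For reference, a minimal sketch of such a filter in Python, assuming the complete response text is available as a single string. The function name strip_think is hypothetical, and this is not necessarily the code the commenter used.

```python
import re

# Matches a <think>...</think> block (possibly empty) and any whitespace after it.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove <think>...</think> blocks from a complete model response."""
    return _THINK_BLOCK.sub("", text).strip()


if __name__ == "__main__":
    # With /no_think, Qwen3 typically still emits an empty think block first.
    raw = "<think>\n\n</think>\n\nHangzhou is known for West Lake fish."
    print(strip_think(raw))
```

The non-greedy match with re.DOTALL keeps the filter from swallowing text between two separate think blocks, and responses without any tag pass through unchanged.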

May 19 '25 09:05 qqzp63168

Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, the <think> tags are no longer generated.

May 22 '25 01:05 Anthonychen1994

Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, the <think> tags are no longer generated.

When I use Dify (version 0.15.3), setting "chat_template_kwargs" through the API does not take effect. The response still contains the <think> tag.

Jun 02 '25 01:06 kolagood

Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, the <think> tags are no longer generated.

When I use Dify (version 0.15.3), setting "chat_template_kwargs" through the API does not take effect. The response still contains the <think> tag.

Here is my example, which effectively removes the think tag:

curl http://ip:16434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-32b",
    "messages": [
      {"role": "user", "content": "What are the famous foods in Hangzhou"}
    ],
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_tokens": 8192,
    "presence_penalty": 1.5,
    "chat_template_kwargs": {"enable_thinking": false}
  }'
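
The same request can be sketched in Python using only the standard library. This assumes the same OpenAI-compatible vLLM endpoint and model name as the curl example; the helper names build_payload and chat are hypothetical, and the base URL must be replaced with a real host.

```python
import json
import urllib.request


def build_payload(prompt: str, enable_thinking: bool = False) -> dict:
    """Build the request body, mirroring the curl example's parameters."""
    return {
        "model": "qwen3-32b",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
        "top_p": 0.8,
        "top_k": 20,
        "max_tokens": 8192,
        "presence_penalty": 1.5,
        # Passed through to the chat template; false turns thinking off.
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
    }


def chat(base_url: str, prompt: str) -> str:
    """POST the payload to /v1/chat/completions and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server; host is a placeholder):
# print(chat("http://ip:16434", "What are the famous foods in Hangzhou"))
```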

Jun 04 '25 15:06 Anthonychen1994

Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, the <think> tags are no longer generated.

When I use Dify (version 0.15.3), setting "chat_template_kwargs" through the API does not take effect. The response still contains the <think> tag.

Here is my example, which effectively removes the think tag:

curl http://ip:16434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-32b",
    "messages": [
      {"role": "user", "content": "What are the famous foods in Hangzhou"}
    ],
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_tokens": 8192,
    "presence_penalty": 1.5,
    "chat_template_kwargs": {"enable_thinking": false}
  }'

Thanks. When I call the model's API directly, adding this parameter prevents the tag. But when I go through Dify's API, it does not take effect.

Jun 07 '25 08:06 kolagood

Qwen3 needs an extra request parameter: "chat_template_kwargs": {"enable_thinking": false}. With this parameter set, the <think> tags are no longer generated.

When I use Dify (version 0.15.3), setting "chat_template_kwargs" through the API does not take effect. The response still contains the <think> tag.

Here is my example, which effectively removes the think tag:

curl http://ip:16434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen3-32b",
    "messages": [
      {"role": "user", "content": "What are the famous foods in Hangzhou"}
    ],
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "max_tokens": 8192,
    "presence_penalty": 1.5,
    "chat_template_kwargs": {"enable_thinking": false}
  }'

Thanks. When I call the model's API directly, adding this parameter prevents the tag. But when I go through Dify's API, it does not take effect.

You should use the qwen3 reasoning parser rather than the deepseek_r1 reasoning parser when starting vLLM: --reasoning-parser qwen3

Aug 08 '25 07:08 Yanhuanjin

I ran into this problem too. The method from the expert above solved it: --reasoning-parser qwen3

Aug 14 '25 07:08 jack-zhuhua

Adding "/no_think" to the system prompt disables the thinking process.

But it will still return the <think> tag.

I've tried it and it works without problems. You need to put "/no_think" on the last line of the system prompt, with a blank line before it.

Adding /no_think does prevent the thinking content from being generated, but the <think> tag is still returned. For now I have to add a code block to filter out the string, which feels a bit inelegant. Of course, it's not that important.

Could you share the code? Can the tag be replaced in a streaming response?
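
One possible way to do this on a stream, sketched in Python under the assumption that the response arrives as an iterable of text chunks and that a tag may be split across chunk boundaries. The names filter_think and _partial_suffix_len are hypothetical; this is an illustration, not code from Dify or vLLM.

```python
from typing import Iterable, Iterator

OPEN, CLOSE = "<think>", "</think>"


def _partial_suffix_len(buf: str, tag: str) -> int:
    """Length of the longest suffix of buf that is a proper prefix of tag."""
    for n in range(min(len(buf), len(tag) - 1), 0, -1):
        if tag.startswith(buf[-n:]):
            return n
    return 0


def filter_think(chunks: Iterable[str]) -> Iterator[str]:
    """Yield streamed text with <think>...</think> sections removed."""
    buf = ""
    inside = False  # True while we are between <think> and </think>
    for chunk in chunks:
        buf += chunk
        while True:
            tag = CLOSE if inside else OPEN
            idx = buf.find(tag)
            if idx == -1:
                break
            if not inside and idx > 0:
                yield buf[:idx]  # visible text before the opening tag
            buf = buf[idx + len(tag):]
            inside = not inside
        # No complete tag remains; hold back only a possible partial tag.
        keep = _partial_suffix_len(buf, CLOSE if inside else OPEN)
        if not inside and len(buf) > keep:
            yield buf[:len(buf) - keep]
        buf = buf[len(buf) - keep:]
    if buf and not inside:
        yield buf  # flush text that only looked like the start of a tag


if __name__ == "__main__":
    stream = ["<thi", "nk>\n\n</th", "ink>\n\nHello", " world"]
    print("".join(filter_think(stream)).strip())
```

Holding back only the characters that could still begin a tag keeps latency low: ordinary text is forwarded as soon as it arrives.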

Aug 15 '25 08:08 jiabangyao
