issue: Think tags not playing well with Native Tools enabled.
Check Existing Issues
- [x] I have searched the existing issues and discussions.
- [x] I am using the latest version of Open WebUI.
Installation Method
Docker
Open WebUI Version
0.6.5
Ollama Version (if applicable)
0.6.6
Operating System
Windows 11
Browser (if applicable)
No response
Confirmation
- [x] I have read and followed all instructions in README.md.
- [x] I am using the latest version of both Open WebUI and Ollama.
- [x] I have included the browser console logs.
- [x] I have included the Docker container logs.
- [x] I have listed steps to reproduce the bug in detail.
Expected Behavior
Since the model's response contains a closing </think> tag, the UI should end the thinking block and proceed with the normal response.
This happens when native tools are included (not even called).
Actual Behavior
It seems to get stuck, even though the closing </think> tag is there.
This happens when native tools are included (not even called, just attached). If no native tools are attached, the thinking works just fine.
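For illustration, the expected streaming behavior can be sketched like this. This is a minimal sketch in Python, not Open WebUI's actual code; the function name and structure are my own. It splits the stream into reasoning and answer at the closing </think> tag, handling the tag arriving split across chunk boundaries (the failure mode described here is the tag being ignored, so everything stays in the thinking state):

```python
CLOSE_TAG = "</think>"

def split_think_stream(chunks):
    """Consume streamed text chunks; return (reasoning, answer).

    Keeps a small tail buffer so the closing tag is still detected
    when it is split across two chunks.
    """
    reasoning_parts = []
    answer_parts = []
    pending = ""        # may hold the start of a partially received tag
    in_think = True
    for chunk in chunks:
        if not in_think:
            answer_parts.append(chunk)
            continue
        pending += chunk
        idx = pending.find(CLOSE_TAG)
        if idx != -1:
            # Tag found: everything before it is reasoning, after it is answer.
            reasoning_parts.append(pending[:idx])
            answer_parts.append(pending[idx + len(CLOSE_TAG):])
            pending = ""
            in_think = False
        else:
            # Emit all but a tail that could be the beginning of the tag.
            keep = len(CLOSE_TAG) - 1
            if len(pending) > keep:
                reasoning_parts.append(pending[:-keep])
                pending = pending[-keep:]
    if in_think:
        # Tag never arrived (the bug scenario): all text stays as reasoning.
        reasoning_parts.append(pending)
    return "".join(reasoning_parts), "".join(answer_parts)
```

With native tools attached, the reported symptom corresponds to the "tag never arrived" branch: the UI never transitions out of the thinking state.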
Steps to Reproduce
- Download Qwen 3 8b from Ollama
- Run it on Open WebUI (OWUI)
- Attach a tool to it, set to native tool calling.
- The model outputs the correct closing </think> tag, but the OWUI interface seems to hang.
- This is in streaming mode. In non-streaming mode, everything displays correctly (although the think block does not collapse and remains in the message).
Logs & Screenshots
Additional Information
No response
Same issue here: when calling Qwen3 MoE 30B-A3B through the RAGFlow API, Open WebUI just ignores the closing tag and the thinking state lasts forever.
Same issue
Same here
Looks like a bug. Qwen3 MoE 30B-A3B stays stuck in the thinking phase forever for me as well, although the final response is eventually generated.
I think this is more of a Qwen3 model family issue (or how it integrates with Ollama) rather than a problem with Open-WebUI. Maybe using a better chat_template in Ollama could resolve it (I’ve tried, but it didn’t work). The model isn’t thinking forever (at least not for me), but the real issue is that it always puts answers inside thinking tags when tools are used.
NOTE: this issue title is misleading because it only happens with Qwen3 models (as far as I know), not with every model while using native tools.
@bgeneto I don't think it is an Ollama issue: when function calling is set to default it works perfectly, but when you switch it to native you see this behavior.
Latest Ollama and Open WebUI on Kubernetes.
UPDATE 2025-05-02: Seems like I was wrong, it is actually an Ollama problem; thanks to @tjbck for the clarification.
Seems to be caused by responses not being properly streamed from the Ollama-end. Investigating.
Related: https://github.com/ollama/ollama/issues/9632
@basirsedighi It is indeed an Ollama issue (a tool-streaming issue). That being said, we have also just addressed this edge case on our end in the dev branch.
6d81eef425b1a602a1b6933c58ff7848acd0b9af
Do empty-content think tags also cause it to loop forever?
It seems possible.
When I enable thinking for Qwen3, it tends to put BOTH the thinking and the response itself inside the thinking block, and forgets to close it?
hmmm..
I pulled the dev Docker image 5 minutes ago.
with no_think
with think
YEAYY!!!!! THANK YOU!!! A hundred thanks!
Now the <think> block is cut off and you can't see the thinking process; if a task requires multiple function calls, you just see some response after every function call. Also, the function call shows a loading icon when you refresh the page, even if the response is already done.
It would be better to see the thinking process, because it takes a lot of time and otherwise you have no clue what's going on during that time.