issue: Think tags not playing well with Native Tools enabled.
Check Existing Issues
- [x] I have searched the existing issues and discussions.
- [x] I am using the latest version of Open WebUI.
Installation Method
Docker
Open WebUI Version
0.6.5
Ollama Version (if applicable)
0.6.6
Operating System
Windows 11
Browser (if applicable)
No response
Confirmation
- [x] I have read and followed all instructions in README.md.
- [x] I am using the latest version of both Open WebUI and Ollama.
- [x] I have included the browser console logs.
- [x] I have included the Docker container logs.
- [x] I have listed steps to reproduce the bug in detail.
Expected Behavior
Since the model's response contains a closing </think> tag, the UI should end the thinking block and proceed with the normal response.
This happens when native tools are included (not even called).
Actual Behavior
It seems to get stuck, even though the closing </think> tag is there.
This happens when native tools are included (not even called, just attached). If no native tools are attached, the thinking works just fine.
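For illustration, the expected streaming behavior can be sketched like this. This is a minimal sketch in Python, not Open WebUI's actual code; the function name and structure are my own. It splits the stream into reasoning and answer at the closing </think> tag, handling the tag arriving split across chunk boundaries (the failure mode described here is the tag being ignored, so everything stays in the thinking state):

```python
CLOSE_TAG = "</think>"

def split_think_stream(chunks):
    """Consume streamed text chunks; return (reasoning, answer).

    Keeps a small tail buffer so the closing tag is still detected
    when it is split across two chunks.
    """
    reasoning_parts = []
    answer_parts = []
    pending = ""        # may hold the start of a partially received tag
    in_think = True
    for chunk in chunks:
        if not in_think:
            answer_parts.append(chunk)
            continue
        pending += chunk
        idx = pending.find(CLOSE_TAG)
        if idx != -1:
            # Tag found: everything before it is reasoning, after it is answer.
            reasoning_parts.append(pending[:idx])
            answer_parts.append(pending[idx + len(CLOSE_TAG):])
            pending = ""
            in_think = False
        else:
            # Emit all but a tail that could be the beginning of the tag.
            keep = len(CLOSE_TAG) - 1
            if len(pending) > keep:
                reasoning_parts.append(pending[:-keep])
                pending = pending[-keep:]
    if in_think:
        # Tag never arrived (the bug scenario): all text stays as reasoning.
        reasoning_parts.append(pending)
    return "".join(reasoning_parts), "".join(answer_parts)
```

With native tools attached, the reported symptom corresponds to the "tag never arrived" branch: the UI never transitions out of the thinking state.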
Steps to Reproduce
- Download Qwen 3 8b from Ollama
- Run it on Open WebUI (OWUI)
- Attach a tool to it, set to native tool calling.
- The model outputs the correct closing </think> tag, but the OWUI interface seems to hang.
- This is in streaming mode. In non-streaming mode, everything displays correctly (although the think block does not collapse and remains in the message).
Logs & Screenshots
Additional Information
No response
Same issue here: when calling Qwen3 MoE 30B-A3B through the RAGFlow API, Open WebUI just ignores the closing tag and the thinking state lasts forever.
Same issue
Same here
Looks like a bug. Qwen3 MoE 30B-A3B stays stuck in the thinking phase forever for me as well, although the final response is eventually generated.
I think this is more of a Qwen3 model family issue (or how it integrates with Ollama) rather than a problem with Open-WebUI. Maybe using a better chat_template in Ollama could resolve it (I’ve tried, but it didn’t work). The model isn’t thinking forever (at least not for me), but the real issue is that it always puts answers inside thinking tags when tools are used.
NOTE: this issue title is misleading because it only happens with Qwen3 models (as far as I know), not with every model while using native tools.
@bgeneto I don't think it is an Ollama issue: when function calling is set to default it works perfectly, but when you switch it to native you see this behavior.
Latest Ollama and Open WebUI on Kubernetes.
UPDATE 2025-05-02: Seems like I was wrong, it is actually an Ollama problem; thanks to @tjbck for the clarification.
Seems to be caused by responses not being properly streamed from the Ollama-end. Investigating.
Related: https://github.com/ollama/ollama/issues/9632
@basirsedighi It is indeed an Ollama issue (a tool-streaming issue). That being said, we have also just addressed this edge case on our end in the dev branch.
6d81eef425b1a602a1b6933c58ff7848acd0b9af
Do empty-content think tags also cause it to loop forever?
It seems possible.
When I enable thinking for Qwen3, it tends to put BOTH the thinking and the response itself inside the thinking block, and forgets to close it?
hmmm..
I pulled the dev Docker image 5 minutes ago.
with no_think
with think
YEAYY!!!!! THANK YOU!!! A hundred thanks!
Now the <think> block is cut off and you can't see the thinking process; if a task requires multiple function calls, you just see some response after every function call. Also, the function call shows a loading icon when you refresh the page, even if the response is already done.
It would be better to see the thinking process, because it takes a lot of time and otherwise you have no clue what's going on during that time.