open-webui icon indicating copy to clipboard operation
open-webui copied to clipboard

issue: Think tags not playing well with Native Tools enabled.

Open ivanwong1989 opened this issue 7 months ago • 12 comments

Check Existing Issues

  • [x] I have searched the existing issues and discussions.
  • [x] I am using the latest version of Open WebUI.

Installation Method

Docker

Open WebUI Version

0.6.5

Ollama Version (if applicable)

0.6.6

Operating System

Windows 11

Browser (if applicable)

No response

Confirmation

  • [x] I have read and followed all instructions in README.md.
  • [x] I am using the latest version of both Open WebUI and Ollama.
  • [x] I have included the browser console logs.
  • [x] I have included the Docker container logs.
  • [x] I have listed steps to reproduce the bug in detail.

Expected Behavior

Since the model's response has has a closing tag, it should be ending the think UI and proceed with the normal response.

this happens when there are native tools included(not even called)

Actual Behavior

It seems to get stuck, even though there is a tag there.

Image

This happens when native tools are included(not even called, just included only). If there are no native tools attached, the thinking works just fine.

Steps to Reproduce

  1. Download Qwen 3 8b from Ollama
  2. Run it on OWUII
  3. Attach a tool to it, set to native tool call.
  4. The outputs the correct but seems like the UI OWUI is hanging.
  5. this is in streaming mode. If it's in non streaming mode, everything displays. (of course the think block would not collapse and remain in message if so)

Logs & Screenshots

Image

Additional Information

No response

ivanwong1989 avatar Apr 29 '25 08:04 ivanwong1989

Same issue here, when calling Qwen3 MoE 30B-A3B trough RAGFlow API, the OpenWebUI just ignored and the thinking state last forever.

AlexRice13 avatar Apr 29 '25 13:04 AlexRice13

Same issue

freezlite avatar Apr 29 '25 21:04 freezlite

Same here

Image

basirsedighi avatar Apr 30 '25 07:04 basirsedighi

looks like a bug, Qwen3 MoE 30B-A3B stays stuck in thinking phase forever to me as well, although the final response is generated eventually

criscola avatar May 01 '25 16:05 criscola

I think this is more of a Qwen3 model family issue (or how it integrates with Ollama) rather than a problem with Open-WebUI. Maybe using a better chat_template in Ollama could resolve it (I’ve tried, but it didn’t work). The model isn’t thinking forever (at least not for me), but the real issue is that it always puts answers inside thinking tags when tools are used.

NOTE: this issue title is misleading because it only happens with Qwen3 models (as far as I know) not with every model while using native tools.

bgeneto avatar May 01 '25 22:05 bgeneto

@bgeneto i dont think is its is a ollama issue, when functioncalling is to default it works perfectly , but when you switch it to native you see this behavor

Latest ollama and openwebui on kubernetis

Image

UPDATE: 2025-05-02 Seems like i am wrong it is acctully a Ollama problem, thanks to @tjbck for clearification

basirsedighi avatar May 02 '25 07:05 basirsedighi

Seems to be caused by responses not being properly streamed from the Ollama-end. Investigating.

tjbck avatar May 02 '25 09:05 tjbck

Related: https://github.com/ollama/ollama/issues/9632

@basirsedighi It is indeed Ollama issue (tool streaming issue), with that being said, we also just addressed this edge case from our end in dev branch.

6d81eef425b1a602a1b6933c58ff7848acd0b9af

tjbck avatar May 02 '25 09:05 tjbck

Does empty content think tags also cause it to loop forever?

Image

ivanwong1989 avatar May 02 '25 10:05 ivanwong1989

I think this is more of a Qwen3 model family issue (or how it integrates with Ollama) rather than a problem with Open-WebUI. Maybe using a better chat_template in Ollama could resolve it (I’ve tried, but it didn’t work). The model isn’t thinking forever (at least not for me), but the real issue is that it always puts answers inside thinking tags when tools are used.

NOTE: this issue title is misleading because it only happens with Qwen3 models (as far as I know) not with every model while using native tools.

Is possible it seems.

When i enable thinking for Qwen3, it tends to have BOTH thinking thoughts and response itself inside the thinking block, and forgets to enclose it ?

hmmm..

ivanwong1989 avatar May 02 '25 10:05 ivanwong1989

I pulled dev docker image 5 minutes ago.

with no_think Image

with think Image

YEAYY!!!!! THANK YOU!!! 100 Gracias!

ivanwong1989 avatar May 02 '25 10:05 ivanwong1989

Now <think> cut and you can't see the process of thinking, if task requires multiple function calls you just seeing some response after every function call. Also function call has loading icon if you update the page even if response already done.

It would be better to see thinking process because it spends a lot of time and you have no clue whats going on at that time.

freezlite avatar May 03 '25 00:05 freezlite