[Question]: Why do queries through the LightRAG UI and through OpenWebUI behave very differently?
Do you need to ask a question?
- [x] I have searched the existing question and discussions and this question is not already answered.
- [ ] I believe this is a legitimate question, not just a bug or feature request.
Your Question
Queries through LightRag Ui and through OpenWebUI behave very differently
Additional Context
I have been trying to use LightRAG with OpenWebUI and I have observed a surprising behaviour. When interacting with my knowledge graph through the LightRAG UI, it passes around 8,000 tokens to the LLM for the query (see the first logs). However, when interacting with it through OpenWebUI, it passes 40,000+ tokens to the model (see the second logs) for the exact same question. I know there can be variability between two identical queries, but this behaviour is repeatable: every time I query through the LightRAG UI it passes far fewer tokens than through OpenWebUI.
This is a problem because the large number of tokens makes prompt processing very slow on my local machine.
Logs for the query through the LightRAG UI:
},
{
"role": "user",
"content": "comment est ce que je peux mettre a jour une paye de janvier ?"
}
],
"model": "mlx-community/Qwen3-30B-A3B-4bit-DWQ-0508",
"stream": true,
"temperature": 0.5
}
2025-05-29 08:17:11,239 - DEBUG - https://huggingface.co:443 "GET /api/models/mlx-community/Qwen3-30B-A3B-4bit-DWQ-0508/revision/main HTTP/1.1" 200 6220
Fetching 12 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 54767.84it/s]
127.0.0.1 - - [29/May/2025 08:17:12] "POST /v1/chat/completions HTTP/1.1" 200 -
2025-05-29 08:17:12,716 - DEBUG - Starting stream:
2025-05-29 08:17:12,716 - DEBUG - *** Resetting cache. ***
2025-05-29 08:17:12,716 - DEBUG - Returning 8960 tokens for processing.
Logs for the query through OpenWebUI:
},
{
"role": "user",
"content": "comment est ce que je peux mettre a jour une paye de janvier ?"
}
],
"model": "mlx-community/Qwen3-30B-A3B-4bit-DWQ-0508",
"stream": true,
"temperature": 0.5
}
2025-05-29 08:11:50,887 - DEBUG - https://huggingface.co:443 "GET /api/models/mlx-community/Qwen3-30B-A3B-4bit-DWQ-0508/revision/main HTTP/1.1" 200 6220
Fetching 12 files: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:00<00:00, 43996.20it/s]
127.0.0.1 - - [29/May/2025 08:11:52] "POST /v1/chat/completions HTTP/1.1" 200 -
2025-05-29 08:11:52,612 - DEBUG - Starting stream:
2025-05-29 08:11:52,612 - DEBUG - *** Resetting cache. ***
2025-05-29 08:11:52,612 - DEBUG - Returning 44820 tokens for processing.
I'm not familiar with Open WebUI, but it might have something to do with the "top_K" variable being different.
Also, you say that it uses 40k tokens with one method and 8k tokens with the other, and then you post a picture of exactly the same thing you just said. Try adding code maybe? By LightRAG UI, do you mean the LightRAG WebUI? And what exactly do you mean by OpenWebUI?
Hi, thank you for your reply.
I am using all default options; I am just following the guidelines here: https://github.com/open-webui/open-webui/discussions/6286#discussioncomment-12032330
OpenWebUI is meant to be officially supported by LightRAG now through an "Ollama API emulation", as stated in the comment I linked above.
There is no code to provide because I am not running my own code: I start OpenWebUI in a Docker container (https://github.com/open-webui/open-webui), then I start the LightRAG server, and finally I integrate LightRAG with OpenWebUI following the steps here: https://github.com/HKUDS/LightRAG/blob/main/lightrag/api/README.md#about-ollama-api
There is nothing more I can configure inside OpenWebUI; it should apparently just work, as stated in the links above. However, it behaves very differently from when I query through the LightRAG WebUI.
I would like to know if there is any way I could configure the emulated Ollama endpoint in LightRAG to behave the same as when I query through the LightRAG WebUI.
+1, I have bumped into exactly the same problem.
If you add /local or /global before your query in OpenWebUI, you see more or less the same results. I haven't found a way to have this added automatically in the OpenWebUI settings.
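For example, you can test the prefix outside Open WebUI by sending a prefixed message straight to LightRAG's Ollama-emulated chat endpoint. This is only a sketch: the localhost:9621 address, the /api/chat route and the lightrag:latest model name are the defaults described in the LightRAG API README, so adjust them to your deployment.

```python
# Sketch: send a mode-prefixed query to LightRAG's Ollama-emulated chat API.
import requests

payload = {
    "model": "lightrag:latest",  # model name exposed by the Ollama emulation (default)
    "messages": [
        {
            "role": "user",
            # A leading "/global " (or "/local ", "/hybrid ", ...) selects the query mode.
            "content": "/global comment est ce que je peux mettre a jour une paye de janvier ?",
        }
    ],
    "stream": False,
}

resp = requests.post("http://localhost:9621/api/chat", json=payload, timeout=300)
resp.raise_for_status()
print(resp.json())
```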
I noticed two things:
- The LightRAG UI uses /global mode for the query, while OpenWebUI uses hybrid mode.
- The LightRAG UI sets the `top_k` parameter to 10, while OpenWebUI uses the default from your LightRAG environment variable `TOP_K`, which is 60 by default.

So to get the same results from OpenWebUI, you will need to set the `TOP_K` environment variable to 10 and start your query with /global <rest of your query>.
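Something like this in the LightRAG server's .env should do it (a sketch; `TOP_K` is the variable name from the LightRAG README, and defaults may differ between versions):

```
# .env for the LightRAG server (sketch)
# Match the top_k that the LightRAG WebUI sends with its queries
TOP_K=10
```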
The Custom Parameters feature in the OpenWebUI model settings doesn't help either, unfortunately.
I solved it in OpenWebUI by using a filter to add (prepend) text before it is submitted to the LLM (LightRAG). You can use https://openwebui.com/f/anfi/add_or_delete_text, for example.
I can't quite understand, would you mind providing an example?
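A minimal sketch of such a filter, assuming Open WebUI's standard Filter/inlet interface; the hard-coded /global prefix is just an example, and the linked add_or_delete_text function is a more complete, configurable version of the same idea:

```python
"""Open WebUI filter (sketch): prepend a LightRAG mode prefix to the outgoing user message."""

from typing import Optional
from pydantic import BaseModel, Field


class Filter:
    class Valves(BaseModel):
        # Text to prepend to every user message before it is sent to LightRAG.
        prefix: str = Field(default="/global ")

    def __init__(self):
        self.valves = self.Valves()

    def inlet(self, body: dict, __user__: Optional[dict] = None) -> dict:
        # Find the last user message and prepend the mode prefix if it is not there yet.
        for message in reversed(body.get("messages", [])):
            if message.get("role") == "user":
                content = message.get("content", "")
                if isinstance(content, str) and not content.startswith(self.valves.prefix):
                    message["content"] = self.valves.prefix + content
                break
        return body
```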
I will try this today and report back whether it solves the issue for me as well.
Hi, how are you doing? Did it work for you?
The discrepancy between Open WebUI and LightRAG UI results stems from conversation history being forwarded by Open WebUI to LightRAG. When LightRAG’s HISTORY_TURNS environment variable is non-zero, the system concatenates prior turns with the current query, degrading retrieval quality. HISTORY_TURNS was initially set to 3 in early builds and has since been changed to 0. @gasper-lenic-xlab
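If you are on an older build where the default is still 3, the setting can be pinned explicitly in the LightRAG server's .env (a sketch, using the variable name mentioned above):

```
# Do not concatenate prior Open WebUI conversation turns into the retrieval query
HISTORY_TURNS=0
```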
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed because it has not had recent activity. Please open a new issue if you still have this problem.