
[Bug]: [ERROR]Fail to bind LLM used by RAPTOR: {}

Open · senovr opened this issue 2 weeks ago • 8 comments

Is there an existing issue for the same bug?

  • [x] I have checked the existing issues.

RAGFlow workspace code commit ID

g891ee85f

RAGFlow image version

v0.16.0-36-g891ee85f slim

Other environment information

Ubuntu 22.04

Actual behavior

I see the following strange behavior during knowledge base creation:

  1. Initially, I had the system model set to a locally hosted (Ollama) mistral-small.
  2. Later, I switched the system model to a GPU-hosted llama 70b (a server with a GPU exposing an OpenAI-compatible API).

(screenshot: system model settings)

  3. I create a new KB with the following settings (only the relevant part is shown):

(screenshot: knowledge base parsing configuration)

  4. I upload the documents to the KB and start parsing.
  5. Parsing successfully creates chunks and questions, but fails during the RAPTOR step.
  6. In the interface, it throws the following error:
17:47:26 Page(0~100000000): Reused previous task's chunks.
17:47:28 Start RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval).
17:47:28 Task has been received.
17:47:43 [ERROR]Fail to bind LLM used by RAPTOR: {}
17:47:43 [ERROR][Exception]: {}
  7. When I look at the executor logs, they show the following (please see the Additional information section).

Expected behavior

It seems to me that, for some reason, the model used by RAPTOR still points to Ollama, ignoring the fact that I changed the system chat model. (Judging by the traceback below, the failure actually happens inside OllamaEmbed.encode, and the task's embd_id still references the Ollama-hosted e5 model, so it is the embedding call that hits Ollama.) @KevinHuSh, can you please look into this issue?
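
To check whether the error originates on the Ollama side, the call that fails in OllamaEmbed.encode (rag/llm/embedding_model.py) can be replayed directly against Ollama. A minimal sketch, assuming Ollama listens at its default address; the model name is taken from the embd_id in the task dump under Additional information:

```python
# Minimal sketch replaying the call that fails in OllamaEmbed.encode
# (rag/llm/embedding_model.py:262). The host is an assumption -- adjust
# to wherever your Ollama instance listens.
import ollama

client = ollama.Client(host="http://localhost:11434")
try:
    res = client.embeddings(
        model="hf.co/yoeven/multilingual-e5-large-instruct-Q5_K_M-GGUF:latest",
        prompt="test sentence",
    )
    print("embedding length:", len(res["embedding"]))
except ollama.ResponseError as e:
    # An empty body here would match the bare "{}" in the RAGFlow logs.
    print("status:", e.status_code, "error:", repr(e.error))
```

If this also fails with an empty error body, the problem is on the Ollama side (model not loaded, out of memory, etc.) rather than in RAGFlow's model binding.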

Steps to reproduce

See above

Additional information

2025-02-12 17:48:14,787 INFO     56 set_progress(4b62a1dae95011efb9030242ac120006), progress: -1, progress_msg: 17:48:14 [ERROR]Fail to bind LLM used by RAPTOR: {}
2025-02-12 17:48:14,797 ERROR    56 Fail to bind LLM used by RAPTOR: {}
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 499, in do_handle_task
    chunks, token_count = run_raptor(task, chat_model, embedding_model, vector_size, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 406, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 134, in __call__
    raise th.result()
  File "/ragflow/rag/raptor.py", line 92, in summarize
    chunks.append((cnt, self._embedding_encode(cnt)))
  File "/ragflow/rag/raptor.py", line 51, in _embedding_encode
    embds, _ = self._embd_model.encode([txt])
  File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7fa9c3ff32e0>", line 31, in encode
  File "/ragflow/api/db/services/llm_service.py", line 235, in encode
    embeddings, used_tokens = self.mdl.encode(texts)
  File "<@beartype(rag.llm.embedding_model.OllamaEmbed.encode) at 0x7fa9c68e1900>", line 31, in encode
  File "/ragflow/rag/llm/embedding_model.py", line 262, in encode
    res = self.client.embeddings(prompt=txt,
  File "/ragflow/.venv/lib/python3.10/site-packages/ollama/_client.py", line 201, in embeddings
    return self._request(
  File "/ragflow/.venv/lib/python3.10/site-packages/ollama/_client.py", line 74, in _request
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: {}
2025-02-12 17:48:14,802 INFO     56 set_progress(4b62a1dae95011efb9030242ac120006), progress: -1, progress_msg: 17:48:14 [ERROR][Exception]: {}
2025-02-12 17:48:14,809 ERROR    56 handle_task got exception for task {"id": "4b62a1dae95011efb9030242ac120006", "doc_id": "f21521ace94f11ef92af0242ac120006", "from_page": 100000000, "to_page": 100000000, "retry_count": 0, "kb_id": "d3f84a3ce94f11ef8cfe0242ac120006", "parser_id": "naive", "parser_config": {"auto_keywords": 3, "auto_questions": 3, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n      {cluster_content}\nThe above is the content you need to summarize.", "max_token": 1024, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": false}, "chunk_token_num": 128, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "layout_recognize": "DeepDOC", "html4excel": false}, "name": "\u041e\u0431 \u0443\u0442\u0432\u0435\u0440\u0436\u0434\u0435\u043d\u0438\u0438 \u0444\u0435\u0434\u0435\u0440\u0430\u043b\u044c\u043d\u044b\u0445 \u043d\u043e\u0440\u043c \u0438 \u043f\u0440\u0430\u0432\u0438\u043b \u0432 \u043e\u0431\u043b\u0430\u0441\u0442\u0438 \u043f\u0440\u043e\u043c\u044b\u0448\u043b\u0435\u043d\u043d\u043e\u0439 \u0431\u0435\u0437\u043e\u043f\u0430\u0441\u043d\u043e\u0441\u0442\u0438 \u041f\u0440\u0430\u0432\u0438\u043b\u0430 \u0431\u0435\u0437\u043e\u043f\u0430\u0441\u043d\u043e\u0441\u0442\u0438 \u043e\u043f (1) (1).doc", "type": "doc", "location": "\u041e\u0431 \u0443\u0442\u0432\u0435\u0440\u0436\u0434\u0435\u043d\u0438\u0438 \u0444\u0435\u0434\u0435\u0440\u0430\u043b\u044c\u043d\u044b\u0445 \u043d\u043e\u0440\u043c \u0438 \u043f\u0440\u0430\u0432\u0438\u043b \u0432 \u043e\u0431\u043b\u0430\u0441\u0442\u0438 \u043f\u0440\u043e\u043c\u044b\u0448\u043b\u0435\u043d\u043d\u043e\u0439 \u0431\u0435\u0437\u043e\u043f\u0430\u0441\u043d\u043e\u0441\u0442\u0438 \u041f\u0440\u0430\u0432\u0438\u043b\u0430 \u0431\u0435\u0437\u043e\u043f\u0430\u0441\u043d\u043e\u0441\u0442\u0438 \u043e\u043f (1) (1).doc", "size": 1335481, "tenant_id": "f5ba73a0c6ab11ef93290242ac120006", "language": "English", "embd_id": "hf.co/yoeven/multilingual-e5-large-instruct-Q5_K_M-GGUF:latest@Ollama", "pagerank": 0, "kb_parser_config": {"auto_keywords": 3, "auto_questions": 3, "raptor": {"use_raptor": true, "prompt": "Please summarize the following paragraphs. Be careful with the numbers, do not make things up. Paragraphs as following:\n      {cluster_content}\nThe above is the content you need to summarize.", "max_token": 1024, "threshold": 0.1, "max_cluster": 64, "random_seed": 0}, "graphrag": {"use_graphrag": false}, "chunk_token_num": 128, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "layout_recognize": "DeepDOC", "html4excel": false}, "img2txt_id": "", "asr_id": "", "llm_id": "llama-3.1-70b-instruct___OpenAI-API@OpenAI-API-Compatible", "update_time": 1739371654320, "task_type": "raptor"}
Traceback (most recent call last):
  File "/ragflow/rag/svr/task_executor.py", line 626, in handle_task
    do_handle_task(task)
  File "/ragflow/rag/svr/task_executor.py", line 499, in do_handle_task
    chunks, token_count = run_raptor(task, chat_model, embedding_model, vector_size, progress_callback)
  File "/ragflow/rag/svr/task_executor.py", line 406, in run_raptor
    chunks = raptor(chunks, row["parser_config"]["raptor"]["random_seed"], callback)
  File "/ragflow/rag/raptor.py", line 134, in __call__
    raise th.result()
  File "/ragflow/rag/raptor.py", line 92, in summarize
    chunks.append((cnt, self._embedding_encode(cnt)))
  File "/ragflow/rag/raptor.py", line 51, in _embedding_encode
    embds, _ = self._embd_model.encode([txt])
  File "<@beartype(api.db.services.llm_service.LLMBundle.encode) at 0x7fa9c3ff32e0>", line 31, in encode
  File "/ragflow/api/db/services/llm_service.py", line 235, in encode
    embeddings, used_tokens = self.mdl.encode(texts)
  File "<@beartype(rag.llm.embedding_model.OllamaEmbed.encode) at 0x7fa9c68e1900>", line 31, in encode
  File "/ragflow/rag/llm/embedding_model.py", line 262, in encode
    res = self.client.embeddings(prompt=txt,
  File "/ragflow/.venv/lib/python3.10/site-packages/ollama/_client.py", line 201, in embeddings
    return self._request(
  File "/ragflow/.venv/lib/python3.10/site-packages/ollama/_client.py", line 74, in _request
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: {}
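
Note that in the task dump above, llm_id already points at the OpenAI-compatible endpoint, while embd_id still ends in @Ollama, which matches the traceback failing inside OllamaEmbed. A quick sketch to pull both fields out of the dump (task.json is a hypothetical file holding the JSON from the "handle_task got exception" line, saved verbatim):

```python
import json

# task.json is assumed to contain the task JSON from the
# "handle_task got exception" log line, saved verbatim.
with open("task.json") as f:
    task = json.load(f)

print("chat model:     ", task["llm_id"])   # ...@OpenAI-API-Compatible
print("embedding model:", task["embd_id"])  # ...@Ollama  <- still Ollama
```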
