[Feature Request]: When an LLM call fails there is no retry and no temporary storage of already-processed chunks.
Do you need to file a feature request?
- [x] I have searched the existing feature requests and this feature request is not already filed.
- [x] I believe this is a legitimate feature request, not just a question or bug.
Feature Request Description
Do you need to file an issue?
- [X] I have searched the existing issues and this bug is not already filed.
- [X] I believe this is a legitimate bug, not just a question or feature request.
Describe the bug
A large document with 120 chunks fails repeatedly whenever the LLM call fails on just one of those 120 chunks. There should be a mechanism to retry the LLM a few times before failing, and/or to store the already-processed chunks (including their embeddings) so they do not have to be reprocessed on the next attempt. This is a feature request; a rough sketch of such a retry/checkpoint mechanism follows the traceback below.
Failed to extract entities and relationships: Expecting value: line 443 column 1 (char 2431)
Traceback (most recent call last):
File "/app/lightrag/lightrag.py", line 1589, in process_document
await entity_relation_task
File "/app/lightrag/lightrag.py", line 1814, in _process_extract_entities
raise e
File "/app/lightrag/lightrag.py", line 1799, in _process_extract_entities
chunk_results = await extract_entities(
^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lightrag/operate.py", line 2070, in extract_entities
raise first_exception
File "/app/lightrag/operate.py", line 2032, in _process_with_semaphore
return await _process_single_content(chunk)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lightrag/operate.py", line 1944, in _process_single_content
final_result = await use_llm_func_with_cache(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lightrag/utils.py", line 1728, in use_llm_func_with_cache
res: str = await use_llm_func(safe_input_text, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lightrag/utils.py", line 835, in wait_func
return await future
^^^^^^^^^^^^
File "/app/lightrag/utils.py", line 539, in worker
result = await asyncio.wait_for(
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/asyncio/tasks.py", line 520, in wait_for
return await fut
^^^^^^^^^
File "/app/lightrag/api/lightrag_server.py", line 363, in openai_alike_model_complete
return await openai_complete_if_cache(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 189, in async_wrapped
return await copy(fn, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 111, in __call__
do = await self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 153, in iter
result = await action(retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/tenacity/_utils.py", line 99, in inner
return call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/tenacity/__init__.py", line 400, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/root/.local/lib/python3.12/site-packages/tenacity/asyncio/__init__.py", line 114, in __call__
result = await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/lightrag/llm/openai.py", line 190, in openai_complete_if_cache
response = await openai_async_client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/resources/chat/completions/completions.py", line 2583, in create
return await self._post(
^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/_base_client.py", line 1794, in post
return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/_base_client.py", line 1599, in request
return await self._process_response(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/_base_client.py", line 1688, in _process_response
return await api_response.parse()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/_response.py", line 430, in parse
parsed = self._parse(to=to)
^^^^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/openai/_response.py", line 265, in _parse
data = response.json()
^^^^^^^^^^^^^^^
File "/root/.local/lib/python3.12/site-packages/httpx/_models.py", line 832, in json
return jsonlib.loads(self.content, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 338, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/json/decoder.py", line 356, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 443 column 1 (char 2431)
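For illustration, here is a minimal sketch of the requested behaviour, not LightRAG's current implementation: the per-chunk extraction call is retried a few times with exponential backoff, and successful chunk results are persisted to a checkpoint file so that a later failure does not force reprocessing chunks that already succeeded. The names llm_extract, process_chunks, and the checkpoint path are hypothetical.

# Sketch only: hypothetical retry-and-checkpoint wrapper, not LightRAG code.
import asyncio
import json
from pathlib import Path

CHECKPOINT = Path("chunk_results.json")  # hypothetical temporary storage location

async def extract_with_retry(llm_extract, chunk_text, max_retries=3, base_delay=2.0):
    """Call the per-chunk LLM extraction, retrying transient failures with backoff."""
    for attempt in range(1, max_retries + 1):
        try:
            return await llm_extract(chunk_text)
        except (json.JSONDecodeError, TimeoutError):
            if attempt == max_retries:
                raise  # give up only after the configured number of attempts
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))

async def process_chunks(llm_extract, chunks):
    """Process all chunks, skipping any already stored in the checkpoint."""
    done = json.loads(CHECKPOINT.read_text()) if CHECKPOINT.exists() else {}
    for chunk_id, text in chunks.items():
        if chunk_id in done:
            continue  # already processed in a previous, partially failed run
        done[chunk_id] = await extract_with_retry(llm_extract, text)
        CHECKPOINT.write_text(json.dumps(done))  # persist after every chunk
    return done

With something like this, a single bad LLM response would cost at most a few retries of one chunk instead of a full re-run of all 120 chunks.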
Steps to reproduce
No response
Expected Behavior
No response
LightRAG Config Used
Paste your config here
Logs and screenshots
No response
Additional Information
- LightRAG Version:
- Operating System:
- Python Version:
- Related Issues:
Additional Context
No response
The issue likely stems from invalid Unicode encoding in the input file. The latest version rigorously filters input files for encoding compliance. Please test with the most recent version, either by installing from source or by installing v1.4.8rc8 as follows:
pip install "lightrag-hku[api]"==1.4.8rc8
Can other models be used to process documents? There is no local Ollama deployment.
@danielaskdd there is no issue with the character encoding. I was able to process the document after a few tries. However, this stresses the account's token balance if it is a large document with 120 chunks, since every attempt needs embedding, re-ranking, and LLM calls again.
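As a stop-gap until chunk-level retries land, one possible workaround is to wrap the completion function passed to LightRAG so that a malformed JSON response from the gateway is retried before the whole 120-chunk document fails. This is only a sketch under assumptions: it relies on the usual llm_model_func hook and on the openai_complete_if_cache call visible in the traceback above; the model name and retry settings are placeholders.

import json
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential
from lightrag.llm.openai import openai_complete_if_cache

@retry(
    retry=retry_if_exception_type(json.JSONDecodeError),  # retry only bad-JSON responses
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=30),
)
async def resilient_llm_func(prompt, system_prompt=None, history_messages=None, **kwargs):
    # Delegate to the OpenAI-compatible completion; "gpt-4o-mini" is a placeholder model name.
    return await openai_complete_if_cache(
        "gpt-4o-mini",
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages or [],
        **kwargs,
    )

The wrapper would then be passed as llm_model_func when constructing LightRAG (or wired into the server's model-complete hook), so a transient gateway error is retried instead of aborting the whole document.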