R2R
[Ingestion] trying to retry a job
Describe the bug
When we try to restart a failed job, we get the following error in R2R v3.4.0:
500: Error during ingestion: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value
Traceback (most recent call last):
File "/app/core/providers/database/files.py", line 224, in _read_lobject
raise R2RException(
shared.abstractions.exception.R2RException: Large object 352775 not found.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/core/main/services/ingestion_service.py", line 236, in parse_file
await self.providers.database.files_handler.retrieve_file(
File "/app/core/providers/database/files.py", line 153, in retrieve_file
file_content = await self._read_lobject(conn, oid)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/core/providers/database/files.py", line 252, in _read_lobject
await conn.execute("SELECT lo_close($1)", lobject)
^^^^^^^
UnboundLocalError: cannot access local variable 'lobject' where it is not associated with a value
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/app/core/main/orchestration/hatchet/ingestion_workflow.py", line 108, in parse
async for extraction in extractions_generator:
File "/app/core/main/services/ingestion_service.py", line 286, in parse_file
raise R2RDocumentProcessingError(
shared.abstractions.exception.R2RDocumentProcessingError: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 139, in inner_callback
output = task.result()
^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 265, in async_wrapped_action_func
raise e
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 243, in async_wrapped_action_func
return await action_func(context)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/core/main/orchestration/hatchet/ingestion_workflow.py", line 300, in parse
raise HTTPException(
fastapi.exceptions.HTTPException: 500: Error during ingestion: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value
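The traceback suggests that the "Large object 352775 not found" error is raised before `lobject` is assigned, and that a cleanup step then references `lobject` anyway, which hides the original error behind an `UnboundLocalError`. The sketch below reproduces that pattern in isolation and shows one possible guard. It is not the actual code from `files.py`: `FakeConn`, `LargeObjectNotFound`, `lo_open`, `lo_close`, and both reader functions are hypothetical names made up for illustration.

```python
import asyncio


class LargeObjectNotFound(Exception):
    """Hypothetical stand-in for R2RException in this sketch."""


class FakeConn:
    """Hypothetical stand-in for a database connection with no large objects."""

    def __init__(self):
        self.closed = []

    async def lo_open(self, oid):
        # Simulates the "Large object ... not found" failure from the traceback.
        raise LargeObjectNotFound(f"Large object {oid} not found.")

    async def lo_close(self, fd):
        self.closed.append(fd)


async def read_lobject_buggy(conn, oid):
    try:
        lobject = await conn.lo_open(oid)  # raises before 'lobject' is bound
        return lobject
    finally:
        # 'lobject' was never assigned, so this line raises UnboundLocalError
        # and masks the original "not found" error.
        await conn.lo_close(lobject)


async def read_lobject_guarded(conn, oid):
    lobject = None  # bind the name before entering the try block
    try:
        lobject = await conn.lo_open(oid)
        return lobject
    finally:
        if lobject is not None:  # only close a descriptor that was opened
            await conn.lo_close(lobject)


def error_name(reader):
    """Run one reader against a fresh FakeConn and return the exception type name."""
    try:
        asyncio.run(reader(FakeConn(), 352775))
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(error_name(read_lobject_buggy))
    print(error_name(read_lobject_guarded))
```

With a guard like this, the retry would still fail if the large object is really missing, but the job would report the "not found" error instead of the misleading `UnboundLocalError`.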
To Reproduce
Steps to reproduce the behavior:
- Import hundreds of thousands of files
- Bring down a service used for ingestion (ollama/mbxai-embed/large, for example)
- Jobs are going to fail
- Bring the service used for ingestion back up
- Retry a failed job
- See the error
This is just one way to reproduce it; we have hit this issue many times.
Expected behavior
The job retry succeeds.
Desktop (please complete the following information):
- OS: Debian (Docker)
- Browser: not relevant
- Version: not relevant