
[Ingestion] trying to retry a job

Open sticky-note opened this issue 9 months ago • 2 comments

Describe the bug When we try to restart a failed job, we hit the following error in r2r v3.4.0:

500: Error during ingestion: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value
Traceback (most recent call last):
File "/app/core/providers/database/files.py", line 224, in _read_lobject
raise R2RException(
shared.abstractions.exception.R2RException: Large object 352775 not found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/app/core/main/services/ingestion_service.py", line 236, in parse_file
await self.providers.database.files_handler.retrieve_file(
File "/app/core/providers/database/files.py", line 153, in retrieve_file
file_content = await self._read_lobject(conn, oid)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/core/providers/database/files.py", line 252, in _read_lobject
await conn.execute("SELECT lo_close($1)", lobject)
^^^^^^^
UnboundLocalError: cannot access local variable 'lobject' where it is not associated with a value

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/app/core/main/orchestration/hatchet/ingestion_workflow.py", line 108, in parse
async for extraction in extractions_generator:
File "/app/core/main/services/ingestion_service.py", line 286, in parse_file
raise R2RDocumentProcessingError(
shared.abstractions.exception.R2RDocumentProcessingError: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 139, in inner_callback
output = task.result()
^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 265, in async_wrapped_action_func
raise e
File "/usr/local/lib/python3.12/site-packages/hatchet_sdk/worker/runner/runner.py", line 243, in async_wrapped_action_func
return await action_func(context)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/app/core/main/orchestration/hatchet/ingestion_workflow.py", line 300, in parse
raise HTTPException(
fastapi.exceptions.HTTPException: 500: Error during ingestion: Error parsing document: cannot access local variable 'lobject' where it is not associated with a value
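
The second traceback above looks like a common Python pattern: a `finally` block closes a handle that is only assigned on the success path, so the cleanup itself fails when the large object is missing. The sketch below reproduces that pattern outside R2R. It is not the actual `files.py` code; `FakeConn`, `LargeObjectNotFound`, `read_lobject_buggy`, `read_lobject_fixed`, and `outcome` are hypothetical names used only for illustration, and the stub reuses the oid as the handle.

```python
import asyncio


class LargeObjectNotFound(Exception):
    """Stand-in for the R2RException seen in the traceback (assumption)."""


class FakeConn:
    """Hypothetical in-memory stub, not a real database connection."""

    def __init__(self, objects):
        self.objects = dict(objects)  # oid -> bytes
        self.closed = []              # handles passed to lo_close

    async def exists(self, oid):
        return oid in self.objects

    async def lo_open(self, oid):
        return oid  # the stub reuses the oid as the handle

    async def lo_read(self, handle):
        return self.objects[handle]

    async def lo_close(self, handle):
        self.closed.append(handle)


async def read_lobject_buggy(conn, oid):
    try:
        if not await conn.exists(oid):
            raise LargeObjectNotFound(f"Large object {oid} not found.")
        lobject = await conn.lo_open(oid)  # only bound on the success path
        return await conn.lo_read(lobject)
    finally:
        # If the object is missing, `lobject` was never assigned, so this
        # line raises UnboundLocalError and masks the original error.
        await conn.lo_close(lobject)


async def read_lobject_fixed(conn, oid):
    lobject = None  # bind the name before entering the try block
    try:
        if not await conn.exists(oid):
            raise LargeObjectNotFound(f"Large object {oid} not found.")
        lobject = await conn.lo_open(oid)
        return await conn.lo_read(lobject)
    finally:
        # Close only a handle that was actually opened.
        if lobject is not None:
            await conn.lo_close(lobject)


async def outcome(reader, conn, oid):
    """Return the reader's result, or the name of the exception it raised."""
    try:
        return await reader(conn, oid)
    except Exception as exc:
        return type(exc).__name__


if __name__ == "__main__":
    conn = FakeConn({1: b"hello"})
    print(asyncio.run(outcome(read_lobject_buggy, conn, 352775)))
    print(asyncio.run(outcome(read_lobject_fixed, conn, 352775)))
```

With a guard like this, a retry against a missing large object would presumably still fail, but it would report the original "Large object ... not found" error instead of hiding it behind the `UnboundLocalError`.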

To Reproduce Steps to reproduce the behavior:

  1. Import hundreds of thousands of files.
  2. Bring down a service used for ingestion (ollama/mbxai-embed/large, for example).
  3. The jobs fail.
  4. Bring the service used for ingestion back up.
  5. Retry a failed job.
  6. See the error.

This is just one way to reproduce it; we have hit this issue many times.

Expected behavior The job retry succeeds.


Desktop (please complete the following information):

  • OS: Debian (Docker)
  • Browser: Not relevant
  • Version: Not relevant

sticky-note, Feb 17 '25 23:02