
IncompleteRead exception crashes the JobManager

Open VictorVerhaert opened this issue 6 months ago • 2 comments

When running long jobs, the JobManager crashes while trying to download the results.

Traceback (most recent call last):
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 748, in _error_catcher
    yield
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 894, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(1293825744 bytes read, 627790347 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/requests/models.py", line 820, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 1060, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 977, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 872, in _raw_read
    with self._error_catcher():
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/urllib3/response.py", line 772, in _error_catcher
    raise ProtocolError(arg, e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(1293825744 bytes read, 627790347 more expected)', IncompleteRead(1293825744 bytes read, 627790347 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/victor.verhaert/LCFM/lcfm-production/notebooks/JM-LCFM.py", line 137, in <module>
    job_manager.run_jobs(
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/extra/job_management.py", line 273, in run_jobs
    self._update_statuses(df)
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/extra/job_management.py", line 433, in _update_statuses
    self.on_job_done(the_job, df.loc[i])
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/extra/job_management.py", line 373, in on_job_done
    job.get_results().download_files(target=job_dir)
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/rest/job.py", line 502, in download_files
    downloaded = [a.download(target) for a in self.get_assets()]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/rest/job.py", line 502, in <listcomp>
    downloaded = [a.download(target) for a in self.get_assets()]
                  ^^^^^^^^^^^^^^^^^^
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/openeo/rest/job.py", line 378, in download
    for block in response.iter_content(chunk_size=chunk_size):
  File "/home/victor.verhaert/LCFM/lcfm-production/.conda/lib/python3.11/site-packages/requests/models.py", line 822, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(1293825744 bytes read, 627790347 more expected)', IncompleteRead(1293825744 bytes read, 627790347 more expected))

We need to make the job manager more robust to these types of exceptions.
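One possible direction, sketched here under assumptions rather than as the library's actual fix, is to wrap the download step in a small retry helper with exponential backoff. The name `retry_call` and its parameters are hypothetical and not part of openeo; the helper uses only the standard library and takes the exception types to retry on as an argument.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry_call(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 3,
    initial_delay: float = 1.0,
    backoff: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exception types.

    Hypothetical helper (not part of openeo). The wait between attempts
    starts at initial_delay seconds and is multiplied by backoff each time.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: re-raise so the caller can decide
                # whether to log the failure and continue or to abort.
                raise
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

Based on the traceback above, the failing call could then be wrapped as `retry_call(lambda: job.get_results().download_files(target=job_dir), retry_on=(requests.exceptions.ChunkedEncodingError,))`. This assumes that repeating the download from scratch is acceptable and that a partially written file is safely overwritten on the next attempt, which would need to be checked. If all attempts fail, the exception still propagates, so the job manager would additionally need to catch it to keep the `run_jobs` loop alive.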

VictorVerhaert, Aug 14 '24 09:08