internetarchive-downloader

Periodic Exception using multi-part downloads: 'Connection broken: IncompleteRead(X bytes read, Y more expected)'

Open seanwo opened this issue 7 months ago • 0 comments

I have seen this exception several times over hundreds of multi-part downloads, so I am including it here for future debugging. The downloader is usually very robust against timeouts and similar failures. This must be a new class of error that can occur when downloading from the Internet Archive. Running the download again recovers, so it is not a big deal, but I figured I would provide the call stack.

2024-07-30 10:14:39 - ERROR - Exception occurred:
Traceback (most recent call last):
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 748, in _error_catcher
    yield
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 894, in _raw_read
    raise IncompleteRead(self._fp_bytes_read, self.length_remaining)
urllib3.exceptions.IncompleteRead: IncompleteRead(403013341 bytes read, 967567076 more expected)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/requests/models.py", line 820, in generate
    yield from self.raw.stream(chunk_size, decode_content=True)
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 1060, in stream
    data = self.read(amt=amt, decode_content=decode_content)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 977, in read
    data = self._raw_read(amt)
           ^^^^^^^^^^^^^^^^^^^
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 872, in _raw_read
    with self._error_catcher():
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/contextlib.py", line 158, in __exit__
    self.gen.throw(value)
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/urllib3/response.py", line 772, in _error_catcher
    raise ProtocolError(arg, e) from e
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(403013341 bytes read, 967567076 more expected)', IncompleteRead(403013341 bytes read, 967567076 more expected))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Volumes/redump/ia_downloader.py", line 1864, in main
    download(
  File "/Volumes/redump/ia_downloader.py", line 1181, in download
    download_pool.map(file_download, download_queue, chunksize=1)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/Volumes/redump/ia_downloader.py", line 541, in file_download
    download_pool.map(file_download, download_queue, chunksize=1)
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 367, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
                    ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.12/3.12.4/Frameworks/Python.framework/Versions/3.12/lib/python3.12/multiprocessing/pool.py", line 48, in mapstar
    return list(map(*args))
           ^^^^^^^^^^^^^^^^
  File "/Volumes/redump/ia_downloader.py", line 745, in file_download
    for download_chunk in new_response.iter_content(
  File "/Users/seanwo/Environments/my_env/lib/python3.12/site-packages/requests/models.py", line 822, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(403013341 bytes read, 967567076 more expected)', IncompleteRead(403013341 bytes read, 967567076 more expected))
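
For anyone hitting the same thing, here is a rough sketch of one way a download loop could tolerate this error instead of aborting: catch the exception raised while iterating the response, then reopen the stream at the current byte offset with an HTTP Range request. This is not how ia_downloader.py handles it; `download_with_resume` and `requests_stream` are hypothetical names, and the sketch assumes the server honors Range requests and sends the file without Content-Encoding compression (so decoded byte counts match raw offsets).

```python
import time


def download_with_resume(open_stream, sink, retry_on, max_retries=5, backoff=1.0):
    """Write a stream to `sink`, reopening it at the current offset on failure.

    `open_stream(offset)` is a hypothetical callable that must return an
    iterable of byte chunks starting at byte `offset` of the remote file.
    Returns the total number of bytes written.
    """
    offset = 0
    failures = 0
    while True:
        try:
            for chunk in open_stream(offset):
                sink.write(chunk)
                offset += len(chunk)  # bytes already written are kept on failure
            return offset
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise  # give up and surface the last error
            time.sleep(backoff * failures)  # simple linear backoff


def requests_stream(url, chunk_size=1024 * 1024, timeout=30):
    """Build an `open_stream` callable backed by requests and a Range header."""
    import requests  # imported lazily so the helper above has no hard dependency

    def open_stream(offset):
        headers = {"Range": f"bytes={offset}-"} if offset else {}
        response = requests.get(url, headers=headers, stream=True, timeout=timeout)
        response.raise_for_status()
        if offset and response.status_code != 206:
            # Assumption: a non-206 reply means the Range header was ignored.
            raise RuntimeError("server ignored the Range header")
        return response.iter_content(chunk_size=chunk_size)

    return open_stream
```

With requests installed, this could be called as `download_with_resume(requests_stream(url), fh, retry_on=(requests.exceptions.ChunkedEncodingError, requests.exceptions.ConnectionError))`, where `fh` is a file opened in `"wb"` mode. The retry counter here never resets, so a very long download with many isolated drops would eventually give up; resetting it after each successful chunk is a reasonable variation.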

seanwo, Jul 30 '24 17:07