Xee
Long-running code results in `requests` `ChunkedEncodingError` exception (broken connection)
I have a script ingesting ~200 GB of Landsat imagery with the current multi-threaded implementation (no Dataflow). Eventually, I always get an exception like:
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(9186238 bytes read, 1299762 more expected)', IncompleteRead(9186238 bytes read, 1299762 more expected))
This occurs in the `common.robust_getitem` call.
I've had some success reducing the frequency of this exception by lowering the chunk size: for example, I can make it to ~150 GB instead of failing after 90 GB. It is hard to say whether that improvement is reliable, though, since the failure is non-deterministic.
I am not sure of the root cause. It could be a multithreading/lock issue, or the server could be closing the connection prematurely. Either way, the current code only applies the retry/backoff logic to `EEException`s. I've had success by retrying on any `Exception` rather than just `EEException`, but that is not an ideal solution.
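One middle ground between retrying only on `EEException` and retrying on every `Exception` would be to retry on an explicit list of transient error types. The following is a minimal sketch of that idea, not Xee's actual retry code: `call_with_retry` and its parameters are hypothetical names, and it uses only the standard library so it can run on its own.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 6,
    initial_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying with exponential backoff on the listed exceptions only.

    Hypothetical helper for illustration; any exception type not in
    `retry_on` propagates immediately without a retry.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Exponential backoff with a little jitter so that many
            # threads do not all retry at the same instant.
            delay = initial_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Hypothetical usage, assuming `ee` and `requests` are installed and
# `fetch_chunk` stands in for whatever performs the pixel request:
#
#   import ee
#   import requests
#   TRANSIENT = (
#       ee.EEException,
#       requests.exceptions.ChunkedEncodingError,
#       requests.exceptions.ConnectionError,
#   )
#   data = call_with_retry(lambda: fetch_chunk(key), TRANSIENT)
```

Listing the exception types explicitly keeps genuine bugs (for example a `TypeError` or `KeyError`) from being silently retried, while still covering the broken-connection case described above.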
I'd imagine that we don't see this in Dataflow because it has its own worker retry logic?