
Option to fail and retry on slow upload speeds

Open · JustAnotherArchivist opened this issue 5 years ago · 1 comment

Most of my uploads are running fine, but occasionally, it seems that I'm hitting a bad S3 node or similar and end up with upload speeds well below 1 MB/s and wonderful progress bars such as:

24/5121 [02:30<11:06:08,  7.84s/it]

(I'll happily help with debugging this if I can in any way, but that's not the point of this issue.)

I propose adding an option (or options) similar to curl's --speed-limit and --speed-time to ia upload: if the average transfer speed during the time window is below the limit, the transfer gets aborted and retried (if applicable according to the --retries setting). This would prevent such bad connections or nodes from blocking uploads for many hours.
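For reference, curl's flags abort a transfer when the average speed stays below the given limit for the whole measuring window, which is the behaviour proposed here for ia upload; this is only an illustration of the curl semantics, not of any existing ia option:

```sh
# Abort if the average speed stays below 1 MB/s (1000000 bytes/s)
# for 30 consecutive seconds. URL and file name are placeholders.
curl --speed-limit 1000000 --speed-time 30 -T file.bin https://example.org/upload
```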

I briefly looked into how this could be implemented. Requests and the underlying urllib3 don't have such a feature, and the answer to a similar Stack Overflow question is to use urllib2 with threading instead. One idea worth exploring might be a wrapper around the body object that raises an exception if it's not being read from quickly enough; that exception could then be caught higher up and turned into a nice error message instead of a giant traceback. However, http.client doesn't document any guarantee about the read block size, so I'm not sure whether this approach would be reliable.
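A minimal sketch of that wrapper idea, assuming the body is an ordinary file-like object handed to requests as the request data; the class and exception names are illustrative and not part of internetarchive or requests:

```python
import time


class SlowUploadError(Exception):
    """Raised when the average upload speed drops below the limit."""


class SpeedLimitBody:
    """Wraps a file-like body and raises if it isn't read fast enough."""

    def __init__(self, fileobj, speed_limit, speed_time):
        self.fileobj = fileobj          # underlying file-like body
        self.speed_limit = speed_limit  # minimum average speed, in bytes/s
        self.speed_time = speed_time    # length of the measuring window, in s
        self.window_start = time.monotonic()
        self.window_bytes = 0

    def read(self, size=-1):
        chunk = self.fileobj.read(size)
        self.window_bytes += len(chunk)
        elapsed = time.monotonic() - self.window_start
        if elapsed >= self.speed_time:
            if self.window_bytes / elapsed < self.speed_limit:
                raise SlowUploadError(
                    'average upload speed below %d bytes/s over %d s'
                    % (self.speed_limit, self.speed_time))
            # Speed was acceptable: start a new measuring window.
            self.window_start = time.monotonic()
            self.window_bytes = 0
        return chunk
```

A real implementation would also need to expose the body's length (otherwise requests falls back to chunked transfer encoding) and catch SlowUploadError in the retry loop so the aborted upload can be restarted.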

JustAnotherArchivist avatar Mar 25 '20 01:03 JustAnotherArchivist

I'm also suffering from this.

I'm uploading using a CSV. The first upload goes at full speed. The second one sometimes goes at full speed. The third one always slows to 40 s per 1 MB.

gingerbeardman avatar Oct 30 '22 00:10 gingerbeardman