internetarchive
Option to fail and retry on slow upload speeds
Most of my uploads are running fine, but occasionally, it seems that I'm hitting a bad S3 node or similar and end up with upload speeds well below 1 MB/s and wonderful progress bars such as:
24/5121 [02:30<11:06:08, 7.84s/it]
(I'll happily help with debugging this if I can in any way, but that's not the point of this issue.)
I propose adding an option (or options) similar to curl's --speed-limit and --speed-time to ia upload: if the average transfer speed during the time window is below the limit, the transfer gets aborted and retried (if applicable according to the --retries setting). This would prevent such bad connections or nodes from blocking uploads for many hours.
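To illustrate how such an abort would interact with `--retries`, here is a minimal sketch of the proposed semantics. Everything here is hypothetical (the exception class, the `do_upload` callable, and the function name are all made up for illustration); it is not code from the library.

```python
class SlowTransfer(Exception):
    """Transfer aborted because the average speed stayed below the limit."""


def upload_with_speed_retry(do_upload, retries):
    """Run one upload attempt at a time; if an attempt aborts itself with
    SlowTransfer (the hypothetical --speed-limit/--speed-time abort),
    retry up to `retries` more times, mirroring the --retries setting.
    `do_upload` is a hypothetical callable performing a single attempt."""
    for attempt in range(retries + 1):
        try:
            return do_upload()
        except SlowTransfer:
            if attempt == retries:
                raise  # out of retries: surface the error upstream
```

A stuck attempt would then cost at most one speed-time window instead of many hours, since the next attempt may land on a healthier node.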
I briefly looked into how this could be implemented. Requests and the underlying urllib3 don't offer such a feature, and the answer to a similar Stack Overflow question is to use urllib2 with threading instead. One idea worth exploring might be a wrapper around the body object that raises an exception if it isn't being read from quickly enough. That exception could then be caught higher up and turned into a clean error message instead of a giant traceback. However, http.client makes no documented guarantee about the read block size, so I'm not sure how reliable this would be.
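A rough sketch of that wrapper idea, for discussion only. The class and exception names are invented, and as noted above the effective read block size is not guaranteed by http.client, so very short windows may be noisy; the clock is injectable purely to make the behaviour testable.

```python
import io
import time


class SlowReadError(Exception):
    """Raised when the body is not being read fast enough."""


class SpeedLimitedBody:
    """File-like wrapper around an upload body that raises SlowReadError
    when the average read rate over a window of at least `speed_time`
    seconds falls below `speed_limit` bytes per second. A sketch only."""

    def __init__(self, fileobj, speed_limit, speed_time, clock=time.monotonic):
        self._f = fileobj
        self._limit = speed_limit
        self._window = speed_time
        self._clock = clock
        self._start = clock()   # start of the current measurement window
        self._bytes = 0         # bytes read in the current window

    def read(self, size=-1):
        data = self._f.read(size)
        self._bytes += len(data)
        elapsed = self._clock() - self._start
        if elapsed >= self._window:
            rate = self._bytes / elapsed
            if rate < self._limit:
                # caught higher up and turned into a clean error message
                raise SlowReadError(
                    f"average {rate:.0f} B/s < {self._limit} B/s "
                    f"over the last {elapsed:.0f}s")
            # rate is acceptable: start a fresh window
            self._start = self._clock()
            self._bytes = 0
        return data

    def __getattr__(self, name):
        # delegate everything else (seek, tell, etc.) to the wrapped file
        return getattr(self._f, name)
```

Since http.client drives the reads, the exception would surface from inside the request call, which is where it would need to be caught and converted into a friendly abort-and-retry message.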
I'm also suffering from this.
I'm uploading using a CSV. The first upload goes fine at full speed. The second one sometimes goes at full speed. The third one always slows to about 40 s per MB.