internetarchive
Broken pipe error when item is over size limit
When an item, in this case https://archive.org/details/files.pushshift.io_201812, is over its size limit, the following error is returned:
requests.exceptions.ConnectionError: ('Connection aborted.', BrokenPipeError(32, 'Broken pipe'))
However, the error returned by Internet Archive is
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><Resource>This item total number of bytes(1116485106904) is over the per item size limit of 1099511627776. Please contact [email protected] for help fitting your data into the archive.</Resource><RequestId>920e7af2-677d-4f9b-ad9d-1681b53db715</RequestId></Error>
which is never shown to the user, who only sees the broken pipe error.
A related issue about broken pipe errors is open against requests: https://github.com/requests/requests/issues/2422
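If the client did manage to read the response body, the XML error quoted above could be turned into a readable message with something like the sketch below. This is only an illustration: `parse_s3_error` is a hypothetical helper, not part of internetarchive or requests, and it assumes the body always has the flat `<Error>` layout shown in this report.

```python
import xml.etree.ElementTree as ET


def parse_s3_error(body: str) -> dict:
    """Parse a flat S3-style <Error> document into a dict of its fields.

    Hypothetical helper: assumes the layout quoted in this issue, where
    <Error> directly contains <Code>, <Message>, <Resource>, <RequestId>.
    """
    root = ET.fromstring(body)
    if root.tag != "Error":
        raise ValueError(f"expected an <Error> document, got <{root.tag}>")
    # Map each child tag to its text; empty elements become "".
    return {child.tag: (child.text or "") for child in root}


if __name__ == "__main__":
    # Shortened sample in the same shape as the error in this report.
    sample = (
        "<Error><Code>AccessDenied</Code>"
        "<Message>Access Denied</Message>"
        "<Resource>This item is over the per item size limit.</Resource>"
        "</Error>"
    )
    err = parse_s3_error(sample)
    print(f"{err['Code']}: {err['Resource']}")
```

The `Resource` field is the one that carries the actual explanation here, so that is the part worth surfacing to the user instead of the bare broken pipe.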
I got bitten by this. Been trying to upload ~2 TB of old archival data, and I've been stuck waiting for the past few days hoping it was just an intermittent issue that would go away after a while. For large files, the upload will run for 30 seconds before failing with the broken pipe message. I saw the same timings regardless of connection speed.
I finally thought to upload a small test file just to see what would happen, and got the error message once the upload finished. Frustratingly, the --status-check flag doesn't indicate any problems with the item.
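Until the client surfaces the real error, a pre-flight size check is one possible workaround. The sketch below is only arithmetic: `bytes_over_limit` is a hypothetical helper, the limit is the value quoted in the error message above (it may not be the same for every item), and how you obtain the item's current size is left to the caller.

```python
# Per-item limit as quoted in the error message in this issue (1 TiB).
# Assumption: the limit is the same for the item you are uploading to.
PER_ITEM_LIMIT = 1099511627776


def bytes_over_limit(current_item_size, new_file_sizes, limit=PER_ITEM_LIMIT):
    """Return how many bytes the item would exceed the limit by (0 if it fits)."""
    total = current_item_size + sum(new_file_sizes)
    return max(0, total - limit)


if __name__ == "__main__":
    # The item in this report is already at 1116485106904 bytes.
    excess = bytes_over_limit(1116485106904, [])
    print(f"over the limit by {excess} bytes")
```

For the item in this report the check returns a non-zero excess, which would have flagged the problem before any upload was attempted.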
@jjjake, I noticed you've been making a few commits recently. Are you planning on working on some of the open issues? If not, would you accept PRs for them?
@AGSPhoenix PRs are always welcome.