s3cmd
s3cmd put files from tar stdin: [Errno 32] Broken pipe
I was trying to upload files from stdin with s3cmd, using the command below:
tar cfz - folder | s3cmd put - s3://backups/abcd.tar
but sometimes I get this error:
ERROR: Cannot retrieve any response status before encountering an EPIPE or ECONNRESET exception
WARNING: Upload failed: /abcd.tar?partNumber=21&uploadId=... ([Errno 32] Broken pipe)
WARNING: Waiting 3 sec...
I think s3cmd is waiting too long for tar, and Amazon S3 closes the connection.
Is it possible for you to test the latest MASTER version of s3cmd? I think this issue should be fixed with recent changes. (Amazon is doing funky things with short connection timeouts in some conditions.) Otherwise, the fixes should be available with the next release (>= 2.2.0).
I am also having this problem in 2.0.2 and 2.1.0, installed with either apt or pip, and in 2.1.0+ (the current master via the zip file on GitHub). I was able to make it go away by setting multipart-chunk-size-mb=128.
Any value higher than 128 results in the broken pipe error; values of 128 or less work fine. It also works if I remove the option (because the default is 15 MB).
I'm using DigitalOcean Spaces instead of AWS, so the exact values may be irrelevant, but perhaps the option is significant in some way?
Also having this problem consistently using DO Spaces.
- s3cmd version 2.0.1
- multipart-chunk-size-mb=75
upload: 'F1 2016/03 China 02.Race.Session.SD.mp4' -> 's3://bucket/vidcap/F1 2016/03 China 02.Race.Session.SD.mp4' [part 18 of 23, 75MB] [2 of 56]
 524288 of 78643200     0% in   14s    35.47 kB/s  failed
ERROR: Cannot retrieve any response status before encountering an EPIPE or ECONNRESET exception
WARNING: Upload failed: /vidcap/F1%202016/03%20China%2002.Race.Session.SD.mp4?partNumber=18&uploadId=2~dD9pLHJSmLt3KnCztOxd9xnW2CJYAw_ ([Errno 32] Broken pipe)
WARNING: Retrying on lower speed (throttle=0.00)
WARNING: Waiting 3 sec...
upload: 'F1 2016/03 China 02.Race.Session.SD.mp4' -> 's3://bucket/vidcap/F1 2016/03 China 02.Race.Session.SD.mp4' [part 18 of 23, 75MB] [2 of 56]
 78643200 of 78643200   100% in   65s  1173.53 kB/s  done
I had hoped the fix described in https://github.com/s3tools/s3cmd/issues/1114 would fix my problem as well, but I still see it in s3cmd version 2.2.0:
ERROR: Cannot retrieve any response status before encountering an EPIPE or ECONNRESET exception
WARNING: Upload failed: /path/to/some/file.tar?partNumber=1&uploadId=2~fHTq7x2OS-KGEL0bitYYvUKms1mtqEi ([Errno 32] Broken pipe)
WARNING: Waiting 3 sec...
ERROR: Cannot retrieve any response status before encountering an EPIPE or ECONNRESET exception
WARNING: Upload failed: /path/to/some/file.tar?partNumber=2&uploadId=2~fHTq7x2OS-KGEL0bitYYvUKms1mtqEi ([Errno 32] Broken pipe)
WARNING: Waiting 3 sec...
- It eventually uploads fine, but I suspect those errors prolong the upload time.
- The upload is not to an AWS instance but to an S3 instance of a non-public and non-profit provider for weather and climate data.
Edit:
To test multipart-chunk-size-mb=128: where do I set that? I tried it in my ~/.s3cfg, with and without spaces around '=', but it always says
WARNING: Ignoring invalid line in '/home/<user>/.s3cfg': multipart-chunk-size-mb=128
or
WARNING: Ignoring invalid line in '/home/<user>/.s3cfg': multipart-chunk-size-mb = 128
@aurisnoctis You can try playing with the two following parameters in the config:
connection_pooling = True
# How long in seconds a connection can be kept idle in the pool and still
# be alive. AWS s3 is supposed to close connections that are idle for 20
# seconds or more, but in real life, undocumented, it closes https conns
# after around 6s of inactivity.
connection_max_age = 5
First, you can try reducing connection_max_age to something like 1 second; if you are still encountering the issue, disable the "reuse" of connections entirely with connection_pooling = False.
Please let me know if this fixes your issue.
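For reference, a minimal sketch of the adjusted ~/.s3cfg fragment implementing the suggestion above (the 1-second value is just the suggested starting point; tune it to your provider's observed idle timeout):

```ini
# Keep reusing connections between requests (avoids per-request setup cost)...
connection_pooling = True
# ...but drop any pooled connection idle for more than 1 second, so s3cmd
# never sends a request over a socket the server has already closed.
connection_max_age = 1
```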
@fviard Thank you. I'm testing.
Is the multipart-chunk-size-mb=128 option mentioned in https://github.com/s3tools/s3cmd/issues/1127#issuecomment-817027287 obsolete?
No, it is a different setting, and so a different way to address a similar-looking problem.
The bigger your chunk size, the greater the chance that a network or server error occurs during a chunk, and in that case the whole chunk has to be retried. But normally, if your network is not too bad, chunks of hundreds of MB should upload fine.
But sometimes, when the chunk is too big, the md5/processing time between two chunks could exceed the server's timeout between two requests.
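To make the tradeoff concrete, a small arithmetic sketch (the sizes are taken from the log excerpt earlier in this thread; the total file size is a hypothetical value chosen to match its "23 parts" figure):

```shell
# Number of multipart parts = ceil(file_size / chunk_size).
# A bigger chunk means fewer parts, but each retry re-sends a whole chunk,
# and per-chunk processing time grows with chunk size.
chunk=$((75 * 1024 * 1024))        # 75 MB chunk size, as in the log above
file_size=$((78643200 * 23))       # hypothetical ~1.7 GiB file (23 full chunks)
parts=$(( (file_size + chunk - 1) / chunk ))
echo "$parts"                      # -> 23
```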
On the other side, the last two variables control whether we reuse a connection after each request instead of losing time setting up a new connection for every request. The problem is that sometimes the server has too low a timeout for persistent connections between requests, so it closes them without notifying us; we then fail at the next request when trying to send it over the previous connection.
Sometimes, cheaper/crappier services may not even accept more than one request per connection.
So, the thing is to try to understand, for each case, what the server's behavior is for reused connections and timeouts. Sometimes it is in their documentation; other times, as with AWS, there are special undocumented timeouts for some endpoints but not all.
This tested combination works for me in ~/.s3cfg:
connection_pooling = True
connection_max_age = 1
No more ERROR: Cannot retrieve any response status before encountering an EPIPE or ECONNRESET. :+1:
In case someone else is wondering about the chunk size option: https://github.com/s3tools/s3cmd/issues/1127#issuecomment-817027287 and https://github.com/s3tools/s3cmd/issues/1127#issuecomment-832881469 state multipart-chunk-size-mb=128. That line in ~/.s3cfg caused the errors above (with or without spaces around =). It turns out this spelling is a valid command-line option, but not a valid line in the s3cmd config file, where underscores are used instead of hyphens; see also https://s3tools.org/kb/item14.htm:
For instance, you could change the multipart_chunk_size_mb default value from 15 to 5, and that would become the new default value for the s3cmd option --multipart-chunk-size-mb.
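To make the two spellings concrete, here is a small sketch (the s3cmd invocation in the comment is illustrative; the bucket and file names are placeholders) showing the command-line form next to the config-file form, with the hyphen-to-underscore conversion between them:

```shell
# Command-line form uses hyphens, e.g.:
#   s3cmd put --multipart-chunk-size-mb=128 bigfile.tar s3://mybucket/bigfile.tar
# Config-file form (~/.s3cfg) uses underscores instead.
opt="multipart-chunk-size-mb"
cfg_key=$(printf '%s' "$opt" | tr - _)   # hyphens -> underscores
echo "${cfg_key} = 128"                  # -> multipart_chunk_size_mb = 128, a line ~/.s3cfg accepts
```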
Thanks again!