mt-aws-glacier

Ignore SIGPIPE in child worker

Open kostko opened this issue 7 years ago • 4 comments

Without this PR, the child worker process will simply exit on SIGPIPE (e.g. when the TCP connection with the remote server breaks) and the parent will report something like this:

EXIT on SIGCHLD (signal 13, exit_code 0)

And then just terminate.
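
The mechanism can be illustrated with a small sketch. This is not the change made in the PR (mt-aws-glacier is written in Perl); it is a Python illustration under the assumption that a local pipe with a closed read end behaves like a broken connection. The helper name `write_to_broken_pipe` is hypothetical. With SIGPIPE ignored, the write fails with an ordinary EPIPE error that the worker can handle, instead of the process being killed by signal 13.

```python
import errno
import os
import signal


def write_to_broken_pipe():
    """Write to a pipe whose read end is closed; return the errno raised."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # simulate the peer going away
    try:
        os.write(write_fd, b"part data")
    except OSError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None


# Ignore SIGPIPE so a write to a dead peer raises an error instead of
# terminating the process. CPython already does this at startup; the explicit
# call only makes the intent visible.
signal.signal(signal.SIGPIPE, signal.SIG_IGN)

result = write_to_broken_pipe()
print(result == errno.EPIPE)
```

Once the signal is ignored, the failed write surfaces as a normal error, so any existing error handling or retry logic in the worker gets a chance to run.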

kostko, Dec 25 '16 20:12

Actually, the remote server should not ever break; this must be a bug somewhere. Do you mean the Amazon servers or the internal IPC communications?

vsespb, Dec 25 '16 20:12

I am getting these errors consistently when uploading large files on a new system (Ubuntu 16.10). I did not encounter this before on Ubuntu 14.04, so I am not sure where the problem is (could be a change in some dependent module?).

I am also not sure whether this SIGPIPE comes from the parent-child IPC or from the remote HTTP socket. I'll do some more debugging later, but with this change the file upload seems to be working at the moment (previously it failed consistently after uploading at most 10 parts).

kostko, Dec 25 '16 20:12

Perhaps also the parent should retry when a child exits due to SIGPIPE?
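
A rough sketch of what such a retry could look like, again in Python rather than the project's Perl and not taken from the mt-aws-glacier code base. The `WORKER` script, the function `run_with_retry`, and the `max_attempts` parameter are all hypothetical names for illustration. The parent checks whether the child was terminated by SIGPIPE and, if so, starts it again up to a fixed number of attempts.

```python
import signal
import subprocess
import sys

# Hypothetical worker: restores the default SIGPIPE action, then writes to a
# pipe whose read end is closed, so the kernel kills it with SIGPIPE.
WORKER = """
import os, signal
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
r, w = os.pipe()
os.close(r)
os.write(w, b"x")
"""


def run_with_retry(cmd, max_attempts=3):
    """Run cmd; retry only when the child was killed by SIGPIPE."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        proc = subprocess.run(cmd)
        # A negative returncode -N means the child was terminated by signal N.
        if proc.returncode != -signal.SIGPIPE:
            return proc.returncode, attempts
    return proc.returncode, attempts


status, attempts = run_with_retry([sys.executable, "-c", WORKER])
print(status == -signal.SIGPIPE, attempts)
```

The worker here always fails, so the sketch exhausts its attempts; a real parent would also need to decide which unit of work to hand to the restarted child.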

kostko, Dec 25 '16 20:12

actually, remote server should not ever break

Why not? A TCP connection may be reset at any time, although this may indicate network or overload issues somewhere between me and AWS.

kostko, Dec 25 '16 20:12