ipc-channel

Sending a sender over a channel and immediately receiving from the corresponding receiver intermittently fails on Linux

Open jdm opened this issue 5 years ago • 2 comments

https://github.com/servo/ipc-channel/pull/179 This is one of our longest-standing and most widespread intermittent failures in Servo's automated tests.

Jan 09 '20 19:01 jdm

https://github.com/servo/ipc-channel/blob/84a1483420ba60110c2388aaa3062511339976d1/src/platform/unix/mod.rs#L1018-L1025

If the error we get is an EINTR coming from here, we should retry the recvmsg, not abort. From what I'm reading, EINTR can happen here at random when the process receives basically any UNIX signal.

An example of a loop that exists for the sake of retrying on EINTR, from wine: https://github.com/wine-mirror/wine/blob/9642f35922b79cebacdc774eb54619e389ccd531/dlls/ntdll/server.c#L793

A more Rust-specific warning about this: https://people.gnome.org/~federico/blog/rust-libstd-syscalls-and-errors.html
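For illustration only, here is a minimal sketch of the kind of retry loop being suggested. It is not the ipc-channel code: `retry_on_eintr` is a hypothetical helper name, and the sketch assumes the failing call has already been converted into a `std::io::Result`, where the standard library reports EINTR as `ErrorKind::Interrupted`.

```rust
use std::io;

/// Hypothetical helper (not part of ipc-channel): run a fallible operation
/// and retry it for as long as it fails with `ErrorKind::Interrupted`,
/// which is how Rust's standard library classifies EINTR.
/// Any other error, or a success, is returned to the caller unchanged.
fn retry_on_eintr<T, F>(mut op: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    loop {
        match op() {
            // Interrupted by a signal: nothing was received, so try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            // Success or a real error: hand it back as-is.
            other => return other,
        }
    }
}
```

In a real receive path, the closure would presumably wrap the recvmsg call and turn a -1 return value into `io::Error::last_os_error()`, so that only EINTR is retried and every other errno still surfaces as an error.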

Feb 06 '20 21:02 pshaughn

I jammed an assert that it wasn't EINTR in there and started running the test from #179 in a loop in a bunch of terminals, but I'm not getting either that test's failure or my own assertion failure. My environment (a Mint VM on a Windows host) might not be vulnerable to whatever problem this is. Someone who can reproduce this easily should check which Unix error code it is. (If it is EINTR, every blocking libc call that lists EINTR as a possible error in its man page should get similar retry logic; this is not exclusive to recvmsg.)
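For anyone who can reproduce the failure, one possible way to capture the error code is sketched below. This is an assumption about how one might instrument the failing path, not existing ipc-channel code, and `describe_io_error` is a hypothetical name.

```rust
use std::io;

/// Hypothetical diagnostic helper (not part of ipc-channel): format an
/// `io::Error` so that the raw OS error code, if there is one, is visible
/// next to the `ErrorKind` that the standard library mapped it to.
fn describe_io_error(err: &io::Error) -> String {
    match err.raw_os_error() {
        // The error came from the OS: report the numeric errno.
        Some(code) => format!("errno {} ({:?}): {}", code, err.kind(), err),
        // The error was constructed in Rust code and carries no errno.
        None => format!("no errno ({:?}): {}", err.kind(), err),
    }
}
```

Calling `io::Error::last_os_error()` immediately after the failing libc call and logging the result of this helper would show whether the code is EINTR or something else.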

Feb 06 '20 22:02 pshaughn