netlink
netlink copied to clipboard
Fix recvfrom goroutine leak
unix.Recvfrom does not return after fd close.
Fixes #792
About why old TestIfSocketCloses passed:
SetReceiveTimeout
make Receive()
always return unix.EAGAIN
Looks like this one may have introduced a regression; see https://github.com/moby/moby/pull/46982#issuecomment-2315840500
EINTR on netlink sockets is a new one. I suspect it has more to do with the netlink dependency bump you pulled in when rebasing than on the Go toolchain bump. I think the bug is here: https://github.com/vishvananda/netlink/blob/92645823f36c7ed03faf4baa566078d9d5e06fda/nl/nl_linux.go#L821-L824 It retries on
EWOULDBLOCK
(a.k.a.EAGAIN
) but neglects to retry onEINTR
.
@thaJeztah IO calls on non-blocking sockets will never return -EINTR
. The problem here is that Moby calls SetSocketTimeout
, which sets SO_SNDTIMEO
and SO_RCVTIMEO
. These socket options are only useful for sockets in blocking mode. Setting these probably places the socket back into blocking mode.
https://github.com/moby/moby/blob/92a05cf4148adc26054a5b52b191e0b4d13112cd/libnetwork/ns/init_linux.go#L25-L40
https://github.com/vishvananda/netlink/blob/92645823f36c7ed03faf4baa566078d9d5e06fda/nl/nl_linux.go#L848-L860
I'm not very familiar with this project, but it seems to me that before this PR, only blocking IO is used. This PR adds calls to set the socket as non-blocking, but still allows setting the timeout socket options.
Thanks; let me check with the networking folks about those parts.
Tracing through the kernel sources, I can confidently claim that the timeout sockopts have no effect when the O_NONBLOCK
fcntl is enabled. That's a red herring.
-
O_NONBLOCK
on the socket forces theMSG_DONTWAIT
flag to be set on allrecv[from]
syscalls -
recv
on a netlink socket is implemented in terms ofskb_recv_datagram
-
The implementation of
skb_recv_datagram
only blocks if the timeout is non-zero
The timeout for a non-blocking socket would be zero due to the MSG_DONTWAIT
flag being forced on; therefore setting the timeout sockopts must have no effect on the behavior of the recv
syscall.
Need further proof? The timeouts on new sockets are initialized to LONG_MAX
. And there is no magic hiding in the setsockopt
syscall.
System calls that "shouldn't" return -EINTR
have been known to return -EINTR
in practice when a signal is delivered to the task while in a syscall. See e.g. https://go.dev/issue/38033. Maybe it's a kernel bug which results in a recv on a non-blocking socket returning with -EINTR
upon receiving a signal despite the SA_RESTART
sigaction when a blocking recv would have been implicitly restarted?
This all still feels very strange to me.
I checked tokio
's implementation of TcpStream
and followed the references all the way down to the libc calls. There's no handling of -EINTR
(std::io::ErrorKind::Interrupted
) at all. The netlink-sys crate's tokio socket implementation does not handle -EINTR
either.
I actually rolled my own rtnetlink
Go package a while ago, and it never had to deal with -EINTR
.
Unlike Go programs, Rust programs don't (inherently) have a runtime which sends SIGURG to the process baked in.
Unlike Go programs, Rust programs don't (inherently) have a runtime which sends SIGURG to the process baked in.
Tokio does allow you to register for receiving signals. It's actually quite similar to os/signal
in terms of usage. I imagine there are programs that use Tokio for network IO and receiving signals at the same time.
This change breaks the timeouts, I can raise a PR to fix that.
But, the EINTR
we've been seeing in moby CI is unrelated - it's due to https://github.com/vishvananda/netlink/pull/925, exposing an NLM_F_DUMP_INTR (reported in the netlink response) as an EINTR ... https://github.com/vishvananda/netlink/blob/92645823f36c7ed03faf4baa566078d9d5e06fda/nl/nl_linux.go#L582-L584