
mDNS poll loop

Open · rkuhn opened this issue on Mar 25 '22 · 6 comments

Since updating from 0.41.1 to 0.42.2, there is some poll amplification or loop going on in mDNS: when I include the Mdns behaviour, CPU usage stays above one full core most of the time, with sporadic brief pauses (<1 sec each). Without the Mdns behaviour, CPU usage is zero, as it should be. What is going on is the following:

2022-03-25T13:28:01.627833Z TRACE polling: Poller::wait(_, Some(143.795167566s))    
2022-03-25T13:28:01.627858Z TRACE polling::epoll: wait: epoll_fd=36, timeout=Some(143.795167566s)    
2022-03-25T13:28:01.627884Z TRACE polling::epoll: modify: epoll_fd=36, fd=38, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.627888Z TRACE polling::epoll: modify: epoll_fd=36, fd=40, ev=Event { key: 2, readable: false, writable: true }    
2022-03-25T13:28:01.627932Z TRACE polling::epoll: new events: epoll_fd=36, res=1    
2022-03-25T13:28:01.627930Z TRACE polling::epoll: modify: epoll_fd=36, fd=42, ev=Event { key: 4, readable: false, writable: true }    
2022-03-25T13:28:01.627962Z TRACE polling::epoll: modify: epoll_fd=36, fd=37, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.627998Z TRACE async_io::reactor: react: 1 ready wakers    
2022-03-25T13:28:01.628021Z TRACE ipfs_embed::net::peers: AddressBook polled
2022-03-25T13:28:01.628026Z TRACE async_io::driver: main_loop: waiting on I/O    
2022-03-25T13:28:01.628065Z TRACE async_io::reactor: process_timers: 0 ready wakers    
2022-03-25T13:28:01.628084Z TRACE netlink_sys::smol: poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })    
2022-03-25T13:28:01.628091Z TRACE polling: Poller::wait(_, Some(143.794912085s))    
2022-03-25T13:28:01.628128Z TRACE polling::epoll: wait: epoll_fd=36, timeout=Some(143.794912085s)    
2022-03-25T13:28:01.628154Z TRACE polling::epoll: modify: epoll_fd=36, fd=38, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.628182Z TRACE polling::epoll: new events: epoll_fd=36, res=1    
2022-03-25T13:28:01.628184Z TRACE polling::epoll: modify: epoll_fd=36, fd=40, ev=Event { key: 2, readable: false, writable: true }    
2022-03-25T13:28:01.628206Z TRACE polling::epoll: modify: epoll_fd=36, fd=37, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.628241Z TRACE async_io::reactor: react: 1 ready wakers    
2022-03-25T13:28:01.628269Z TRACE async_io::driver: main_loop: waiting on I/O    
2022-03-25T13:28:01.628267Z TRACE ipfs_embed::net::peers: AddressBook polled
2022-03-25T13:28:01.628302Z TRACE async_io::reactor: process_timers: 0 ready wakers    
2022-03-25T13:28:01.628327Z TRACE polling: Poller::wait(_, Some(143.794674307s))    
2022-03-25T13:28:01.628334Z TRACE netlink_sys::smol: poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })    
2022-03-25T13:28:01.628352Z TRACE polling::epoll: wait: epoll_fd=36, timeout=Some(143.794674307s)    
2022-03-25T13:28:01.628378Z TRACE polling::epoll: modify: epoll_fd=36, fd=38, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.628405Z TRACE polling::epoll: new events: epoll_fd=36, res=1    
2022-03-25T13:28:01.628429Z TRACE polling::epoll: modify: epoll_fd=36, fd=37, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.628463Z TRACE async_io::reactor: react: 1 ready wakers    
2022-03-25T13:28:01.628475Z TRACE ipfs_embed::net::peers: AddressBook polled
2022-03-25T13:28:01.628491Z TRACE async_io::driver: main_loop: waiting on I/O    
2022-03-25T13:28:01.628521Z TRACE async_io::reactor: process_timers: 0 ready wakers    
2022-03-25T13:28:01.628539Z TRACE netlink_sys::smol: poll_recv_from: Err(Os { code: 11, kind: WouldBlock, message: "Resource temporarily unavailable" })    
2022-03-25T13:28:01.628546Z TRACE polling: Poller::wait(_, Some(143.794454524s))    
2022-03-25T13:28:01.628580Z TRACE polling::epoll: wait: epoll_fd=36, timeout=Some(143.794454524s)    
2022-03-25T13:28:01.628606Z TRACE polling::epoll: modify: epoll_fd=36, fd=38, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T13:28:01.628640Z TRACE polling::epoll: modify: epoll_fd=36, fd=42, ev=Event { key: 4, readable: false, writable: true }    
2022-03-25T13:28:01.628680Z TRACE polling::epoll: new events: epoll_fd=36, res=1    

At first I thought if-watch (via rtnetlink) was to blame, because it was emitting several DEBUG lines per loop (filtered out of the listing above), but that is just a symptom, as I confirmed by debugging the if-watch example. The issue manifests itself in the async_io reactor always finding one ready waker to wake, which in turn wakes the swarm containing the Mdns behaviour.

Any hints on what might cause this are greatly appreciated. I already added logging for the timers in Mdns, which would show up in the listing above if timers were being reset/fired/polled (there are no mDNS responses in my network, so there are no records to expire).
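For reference, trace output like the listing above can be captured with a tracing-subscriber setup along these lines. This is a sketch only, assuming the env-filter and log-bridge features of tracing-subscriber; the actual setup in ipfs-embed/Actyx may differ:

use tracing_subscriber::EnvFilter;

fn init_tracing() {
    // Respects RUST_LOG, e.g.
    // RUST_LOG="polling=trace,async_io=trace,netlink_sys=trace,libp2p_mdns=trace"
    // With the default "tracing-log" feature, `log` records from crates like
    // `polling` and `async-io` are forwarded into this subscriber as well.
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .init();
}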

rkuhn · Mar 25 '22

addendum: lsof -nP output showing the relevant FDs

actyx   29873 rkuhn   35u  netlink                           0t0 877398060 ROUTE
actyx   29873 rkuhn   36u  a_inode               0,13          0      9494 [eventpoll]
actyx   29873 rkuhn   37u  a_inode               0,13          0      9494 [eventfd]
actyx   29873 rkuhn   38u  a_inode               0,13          0      9494 [timerfd]
actyx   29873 rkuhn   39u     IPv4          877389402        0t0       UDP *:5353 
actyx   29873 rkuhn   40u     IPv4          877389403        0t0       UDP 95.217.194.97:44860 
actyx   29873 rkuhn   41u     IPv4          877389404        0t0       UDP *:5353 
actyx   29873 rkuhn   42u     IPv4          877389405        0t0       UDP 172.17.0.1:45727 
actyx   29873 rkuhn   43u     IPv6          877398061        0t0       TCP *:4001 (LISTEN)
actyx   29873 rkuhn   44u     IPv4          877398062        0t0       TCP *:4001 (LISTEN)

rkuhn · Mar 25 '22

Okay, here’s the really weird part: with this patch to async-io, the poll loop eventually gets broken out of after some time, settling down into its proper “wait for netlink messages” state:

diff --git a/src/reactor.rs b/src/reactor.rs
index b6ae153..7e79478 100644
--- a/src/reactor.rs
+++ b/src/reactor.rs
@@ -382,9 +382,11 @@ impl Direction {
     /// Moves all wakers into a `Vec`.
     fn drain_into(&mut self, dst: &mut Vec<Waker>) {
         if let Some(w) = self.waker.take() {
+            log::trace!("waking poller");
             dst.push(w);
         }
         for (_, opt) in self.wakers.iter_mut() {
+            log::trace!("waking waiter");
             if let Some(w) = opt.take() {
                 dst.push(w);
             }

I then tried generating a netlink message by adding an IP address, which at first did the right and expected thing and then spun a few dozen loops before going back to sleep again. So this looks like a race condition that, when triggered, leads to endless wakeups of the swarm.

rkuhn · Mar 25 '22

It looks like epoll is being interrupted by the registration of read interest, which immediately notifies a waker; the reactor then waits again, upon which this vicious cycle restarts.

2022-03-25T15:59:07.068412Z TRACE async_io::reactor: woke up after IO    
2022-03-25T15:59:07.068435Z TRACE async_io::reactor: waking poller    
2022-03-25T15:59:07.068460Z TRACE async_io::reactor: react: 1 ready wakers    
2022-03-25T15:59:07.068491Z TRACE async_io::driver: main_loop: waiting on I/O    
2022-03-25T15:59:07.068521Z TRACE async_io::reactor: process_timers: 0 ready wakers    
2022-03-25T15:59:07.068537Z DEBUG netlink_proto::connection: reading incoming messages    
2022-03-25T15:59:07.068545Z TRACE polling: Poller::wait(_, Some(297.113158634s))    
2022-03-25T15:59:07.068568Z TRACE polling::epoll: wait: epoll_fd=36, timeout=Some(297.113158634s)    
2022-03-25T15:59:07.068573Z DEBUG netlink_proto::codecs: NetlinkCodec: decoding next message    
2022-03-25T15:59:07.068591Z TRACE polling::epoll: modify: epoll_fd=36, fd=38, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T15:59:07.068611Z TRACE async_io::reactor: reusing existing waker 0    
2022-03-25T15:59:07.068639Z DEBUG netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2022-03-25T15:59:07.068669Z DEBUG netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2022-03-25T15:59:07.068699Z DEBUG netlink_proto::connection: handling requests    
2022-03-25T15:59:07.068726Z DEBUG netlink_proto::connection: sending messages    
2022-03-25T15:59:07.068769Z TRACE async_io::reactor: modify from poll    
2022-03-25T15:59:07.068794Z TRACE polling::epoll: modify: epoll_fd=36, fd=43, ev=Event { key: 5, readable: true, writable: false }    
2022-03-25T15:59:07.068823Z TRACE polling::epoll: new events: epoll_fd=36, res=1    
2022-03-25T15:59:07.068823Z TRACE async_io::reactor: reusing existing waker 0    
2022-03-25T15:59:07.068846Z TRACE polling::epoll: modify: epoll_fd=36, fd=37, ev=Event { key: 18446744073709551615, readable: true, writable: false }    
2022-03-25T15:59:07.068852Z TRACE async_io::reactor: reusing existing waker 0    
2022-03-25T15:59:07.068872Z TRACE async_io::reactor: woke up after IO    

rkuhn · Mar 25 '22

This rabbit hole is deeper than I can follow right now:

  • epoll immediately returns readable
  • since no EPOLLET option is set, this indicates that the socket still has unread bytes, but those bytes are not consumed between epoll_wait calls (and we note the conspicuous absence of generated events in the logs above); see the sketch after this list
  • I have absolutely no clue where to look for the offending code
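To illustrate the contract at play (this is not the async-io internals, just a minimal standalone sketch): the polling crate delivers events in oneshot mode and interest must be re-armed with modify after each event, and since the underlying epoll is level-triggered, re-arming while unread bytes remain makes the next wait return immediately. The correct pattern is to drain until WouldBlock before re-arming; socket address and key are illustrative:

use polling::{Event, Poller};
use std::io::ErrorKind;
use std::net::UdpSocket;

fn main() -> std::io::Result<()> {
    // Ephemeral port for the sketch; a real mDNS socket on *:5353 would also
    // need SO_REUSEADDR and multicast membership, omitted here.
    let socket = UdpSocket::bind("0.0.0.0:0")?;
    socket.set_nonblocking(true)?;

    let key = 7; // arbitrary key identifying this socket
    let poller = Poller::new()?;
    poller.add(&socket, Event::readable(key))?;

    let mut events = Vec::new();
    let mut buf = [0u8; 4096];
    loop {
        events.clear();
        poller.wait(&mut events, None)?;
        for ev in &events {
            if ev.key == key {
                // Drain everything that is queued; otherwise re-arming below
                // makes the next wait() return immediately (level-triggered).
                loop {
                    match socket.recv_from(&mut buf) {
                        Ok((len, from)) => println!("{} bytes from {}", len, from),
                        Err(e) if e.kind() == ErrorKind::WouldBlock => break,
                        Err(e) => return Err(e),
                    }
                }
                // Interest is oneshot in `polling`; re-arm for the next datagram.
                poller.modify(&socket, Event::readable(key))?;
            }
        }
    }
}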

rkuhn · Mar 25 '22

Okay @mxinden I’m pretty sure I know how it happens, but I’m not sure how to fix it:

  • InterfaceState::poll registers read interest in the mDNS receiver socket (possibly after having drained received datagrams)
  • sometimes this works as intended, namely that epoll doesn’t return this socket as readable until something happens — I think this is the case when read interest is registered before the async-io run loop calls epoll_wait again
  • but most of the time, epoll_wait immediately returns, saying that the mDNS receiver socket is ready, but there are no messages there and the whole cycle begins again

If I remove self.recv_socket.poll_readable(cx) from the poll function in iface.rs, then everything works nicely, including mDNS discovery. I suppose the functionality will be rescued by one of the various timers that wake up the behaviour in any case, leading to poll calls. I’ll probably patch libp2p-mdns in this fashion so that Actyx works again, but what is the proper solution?
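For comparison, here is a generic sketch (not the libp2p-mdns code; function and callback names are made up) of a poll function that only registers read interest after the socket has reported WouldBlock, which is the pattern that avoids spurious level-triggered wakeups:

use async_io::Async;
use std::{
    io,
    net::UdpSocket,
    task::{Context, Poll},
};

fn poll_datagrams(
    socket: &Async<UdpSocket>,
    cx: &mut Context<'_>,
    mut on_datagram: impl FnMut(&[u8]),
) -> Poll<io::Result<()>> {
    let mut buf = [0u8; 4096];
    loop {
        // Non-blocking read on the inner socket.
        match socket.get_ref().recv_from(&mut buf) {
            Ok((len, _from)) => on_datagram(&buf[..len]),
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => {
                // Only now, with the queue drained, ask to be woken when the
                // socket becomes readable again.
                match socket.poll_readable(cx) {
                    // Became readable in the meantime: keep draining.
                    Poll::Ready(Ok(())) => continue,
                    other => return other,
                }
            }
            Err(e) => return Poll::Ready(Err(e)),
        }
    }
}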

rkuhn · Mar 25 '22

Sleeping on it, I have doubts about the current design of libp2p-mdns: I don’t think the UDP ports should be polled every time something unrelated happens in the rest of the Swarm, and I also don’t think that the rest of the Swarm should be polled every time something happens on those UDP ports.
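One way to decouple the two directions would be to give the socket its own task and let the behaviour only poll a channel, so a swarm wakeup touches the channel rather than the socket, and a datagram only wakes the swarm when it actually produces something worth reporting. A rough sketch with made-up names, not a proposal for the actual libp2p-mdns structure:

use async_io::Async;
use futures::{channel::mpsc, SinkExt};
use std::net::UdpSocket;

// Hypothetical task owning the mDNS socket; the behaviour would hold the
// corresponding mpsc::Receiver and poll it (a Stream) from its own poll().
async fn mdns_socket_task(
    socket: Async<UdpSocket>,
    mut out: mpsc::Sender<Vec<u8>>,
) -> std::io::Result<()> {
    let mut buf = [0u8; 4096];
    loop {
        let (len, _from) = socket.recv_from(&mut buf).await?;
        // Parsing/filtering could happen here so that uninteresting packets
        // never wake the swarm at all.
        if out.send(buf[..len].to_vec()).await.is_err() {
            // The behaviour dropped its receiver; shut the task down.
            return Ok(());
        }
    }
}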

rkuhn · Mar 26 '22

Tagging this as important because the mDNS behaviour performs I/O on the main task that all NetworkBehaviours run on.

thomaseizinger · Sep 12 '23