rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

"netlink socket stream shut down" infinite loop

Open nazar-pc opened this issue 2 years ago • 8 comments

Summary

I have not yet identify how exactly, but when doing early shutdown of an application I see this printed forever with Ctrl+C not working, so looks like in the main thread somehow:

2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    

I think the solution here might be to be prepared for runtime shutdown at any time.

Version

0.52.1, but I observed this on previous versions too

Would you like to work on fixing this bug?

Maybe

nazar-pc avatar Aug 02 '23 18:08 nazar-pc

Related: https://github.com/libp2p/rust-libp2p/issues/3390.

thomaseizinger avatar Aug 03 '23 08:08 thomaseizinger

Related: https://github.com/libp2p/rust-libp2p/issues/2591.

thomaseizinger avatar Sep 20 '23 02:09 thomaseizinger

This happens when during Substrate node initialization I press Ctrl+C, so it has to shut down before it even started properly, in this case above errors are printed until process is killed extrenally.

nazar-pc avatar Sep 20 '23 04:09 nazar-pc

I am almost certain that this would be fixed by https://github.com/libp2p/rust-libp2p/issues/2591 because that would move the IO off the main task which is typically the one running in main.rs.

thomaseizinger avatar Sep 20 '23 07:09 thomaseizinger

I just ran into the same issue. I think i tracked it down to the mdns polling if_watch in a loop. When rtnetlink socket is closed if_watch stream will return an error every time poll is called. So either the if_watch stream can be changed to return None (and thus end the stream) if the socket is closed or add an break statement to be mdns behaviour when if_watch returns an error.

  • https://github.com/libp2p/rust-libp2p/blob/e28de77f32b2c550c6b627e2d1006290bb74657d/protocols/mdns/src/behaviour.rs#L327
  • https://github.com/mxinden/if-watch/blob/f21df8166f79d215443823c381ea8b8336d85e3c/src/linux.rs#L107-L123

dkuhnert avatar Jun 24 '24 09:06 dkuhnert

Actually I don't see it in the latest version of libp2p anymore. Are you on older version by any chance?

nazar-pc avatar Jun 24 '24 09:06 nazar-pc

I'm on latest:

% cargo tree -i libp2p
libp2p v0.53.2
...

This happend to me in an unittest which runs very quickly. When i run this test on linux it blocks forever but runs fine on macOS.

The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.

dkuhnert avatar Jun 24 '24 09:06 dkuhnert

The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.

It does return Poll::Pending but its just behind the ready! macro

https://github.com/mxinden/if-watch/blob/31c4780d803431914aab95ec9139aa8e5795f80c/src/linux.rs#L107-L123

dariusc93 avatar Jul 21 '24 15:07 dariusc93

will close this for now as @nazar-pc mentions it doesn't reproduce anymore. Anyone feel free to re-open if not

jxs avatar Nov 05 '24 23:11 jxs