"netlink socket stream shut down" infinite loop
Summary
I have not yet identify how exactly, but when doing early shutdown of an application I see this printed forever with Ctrl+C not working, so looks like in the main thread somehow:
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }
2023-08-02 21:07:17.697 WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages
I think the solution here might be to be prepared for runtime shutdown at any time.
Version
0.52.1, but I observed this on previous versions too
Would you like to work on fixing this bug?
Maybe
Related: https://github.com/libp2p/rust-libp2p/issues/3390.
Related: https://github.com/libp2p/rust-libp2p/issues/2591.
This happens when during Substrate node initialization I press Ctrl+C, so it has to shut down before it even started properly, in this case above errors are printed until process is killed extrenally.
I am almost certain that this would be fixed by https://github.com/libp2p/rust-libp2p/issues/2591 because that would move the IO off the main task which is typically the one running in main.rs.
I just ran into the same issue. I think i tracked it down to the mdns polling if_watch in a loop. When rtnetlink socket is closed if_watch stream will return an error every time poll is called. So either the if_watch stream can be changed to return None (and thus end the stream) if the socket is closed or add an break statement to be mdns behaviour when if_watch returns an error.
- https://github.com/libp2p/rust-libp2p/blob/e28de77f32b2c550c6b627e2d1006290bb74657d/protocols/mdns/src/behaviour.rs#L327
- https://github.com/mxinden/if-watch/blob/f21df8166f79d215443823c381ea8b8336d85e3c/src/linux.rs#L107-L123
Actually I don't see it in the latest version of libp2p anymore. Are you on older version by any chance?
I'm on latest:
% cargo tree -i libp2p
libp2p v0.53.2
...
This happend to me in an unittest which runs very quickly. When i run this test on linux it blocks forever but runs fine on macOS.
The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.
The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.
It does return Poll::Pending but its just behind the ready! macro
https://github.com/mxinden/if-watch/blob/31c4780d803431914aab95ec9139aa8e5795f80c/src/linux.rs#L107-L123
will close this for now as @nazar-pc mentions it doesn't reproduce anymore. Anyone feel free to re-open if not