rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

Dcutr panics with low connection limits

Open dariusc93 opened this issue 3 years ago • 4 comments

Summary

Receives a panic from dcutr when connecting to another peer

thread 'main' panicked at 'Peer of Prototype::DirectConnection is always known.', /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-dcutr-0.6.0/src/behaviour.rs:154:35`

This seems to happen when using the following ConnectionLimit for swarm.

ConnectionLimits::default()
                    .with_max_pending_incoming(Some(128))
                    .with_max_pending_outgoing(Some(128))
                    .with_max_established_incoming(Some(128))
                    .with_max_established_outgoing(Some(128))

https://github.com/dariusc93/libp2p-chat-test/blob/bugged/dcutr-behaviour-rs-154-35/src/behaviour.rs#L175-L181

thread 'main' panicked at 'Peer of `Prototype::DirectConnection` is always known.', /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-dcutr-0.6.0/src/behaviour.rs:154:35
 stack backtrace:
 0: rust_begin_unwind
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/panicking.rs:584:5
 1: core::panicking::panic_fmt
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:142:14
 2: core::panicking::panic_display
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:72:5
 3: core::panicking::panic_str
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/panicking.rs:56:5
 4: core::option::expect_failed
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/option.rs:1874:5
 5: core::option::Option<T>::expect
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/option.rs:738:21
 6: <libp2p_dcutr::behaviour::Behaviour as libp2p_swarm::behaviour::NetworkBehaviour>::inject_dial_failure
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-dcutr-0.6.0/src/behaviour.rs:154:27
 7: <libp2p_swarm::behaviour::toggle::Toggle<TBehaviour> as libp2p_swarm::behaviour::NetworkBehaviour>::inject_dial_failure
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-swarm-0.39.0/src/behaviour/toggle.rs:158:17
 8: <libp2p_chat::behaviour::ChatBehaviour as libp2p_swarm::behaviour::NetworkBehaviour>::inject_dial_failure
 at ./src/behaviour.rs:34:10
 9: libp2p_swarm::Swarm<TBehaviour>::dial_with_handler
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-swarm-0.39.0/src/lib.rs:538:17
 10: libp2p_swarm::Swarm<TBehaviour>::handle_behaviour_event
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-swarm-0.39.0/src/lib.rs:941:33
 11: libp2p_swarm::Swarm<TBehaviour>::poll_next_event
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-swarm-0.39.0/src/lib.rs:1075:56
 12: <libp2p_swarm::Swarm<TBehaviour> as futures_core::stream::Stream>::poll_next
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/libp2p-swarm-0.39.0/src/lib.rs:1218:9
 13: futures_util::stream::stream::StreamExt::poll_next_unpin
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.24/src/stream/stream/mod.rs:1626:9
 14: <futures_util::stream::stream::select_next_some::SelectNextSome<St> as core::future::future::Future>::poll
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.24/src/stream/stream/select_next_some.rs:34:36
 15: libp2p_chat::main::{{closure}}::{{closure}}
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/macros/select.rs:507:49
 16: <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/future/poll_fn.rs:38:9
 17: libp2p_chat::main::{{closure}}
 at ./src/main.rs:163:9
 18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/future/mod.rs:91:19
 19: tokio::park::thread::CachedParkThread::block_on::{{closure}}
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/park/thread.rs:267:54
 20: tokio::coop::with_budget::{{closure}}
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/coop.rs:102:9
 21: std::thread::local::LocalKey<T>::try_with
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/thread/local.rs:445:16
 22: std::thread::local::LocalKey<T>::with
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/std/src/thread/local.rs:421:9
 23: tokio::coop::with_budget
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/coop.rs:95:5
 24: tokio::coop::budget
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/coop.rs:72:5
 25: tokio::park::thread::CachedParkThread::block_on
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/park/thread.rs:267:31
 26: tokio::runtime::enter::Enter::block_on
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/enter.rs:152:13
 27: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/scheduler/multi_thread/mod.rs:79:9
 28: tokio::runtime::Runtime::block_on
 at /home/darius/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.0/src/runtime/mod.rs:492:44
 29: libp2p_chat::main
 at ./src/main.rs:287:5
 30: core::ops::function::FnOnce::call_once
 at /rustc/4b91a6ea7258a947e59c6522cd5898e7c0a6a88f/library/core/src/ops/function.rs:248:5

Thought it was related to a separate project I was working on but wanted to isolate it first to try to reproduce it with just libp2p. It does seem to do better with limits set to 512 or more but hard to really determine if thats the case and scanning through the code doesnt really give a clear answer.

Expected behaviour

Not to panic when using dcutr with connection limits set low.

Actual behaviour

Panics sometime after swarm

Debug Output

<output>

Possible Solution

Version

  • libp2p version: v0.48

Would you like to work on fixing this bug?

Maybe

dariusc93 avatar Sep 15 '22 01:09 dariusc93

Thanks for reporting this panic @dariusc93.

The root of the issue is here:

https://github.com/libp2p/rust-libp2p/blob/2c739e9bdb2c51199b7ade7f3aa0cb65671e5721/swarm/src/lib.rs#L538

When we hit the connection limit, we report the issue to the NetworkBehaviours without a PeerId even though we might have one available.

If I am not mistaken the patch for this panic would be simply:

-                self.behaviour.inject_dial_failure(None, handler, &error);
+                self.behaviour.inject_dial_failure(peer_id, handler, &error);

@dariusc93 could you validate the above and propose a patch in case it resolves your issue?

mxinden avatar Sep 16 '22 12:09 mxinden

Thanks for the response @mxinden, I did try that and it did resolve the panic I was getting with a low limit (which also allow the logs to see the error too)

dariusc93 avatar Sep 16 '22 19:09 dariusc93

@dariusc93 could you validate the above and propose a patch in case it resolves your issue?

@dariusc93 want to create a pull request? :innocent:

mxinden avatar Sep 21 '22 10:09 mxinden

Sorry @mxinden been busy so i forgot to create it. Will do that in just a bit!

dariusc93 avatar Sep 22 '22 02:09 dariusc93

Given that https://github.com/libp2p/rust-libp2p/pull/2928 is merged I am closing here. Thanks @dariusc93!

mxinden avatar Sep 27 '22 13:09 mxinden