rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

refactor(*): Transport redesign

Open umgefahren opened this issue 1 year ago • 26 comments

Description

Resolves: #4226. Resolves: #3953.

Notes & open questions

What do you think of the implementation?

Change checklist

  • [x] I have performed a self-review of my own code
  • [x] I have made corresponding changes to the documentation
  • [x] I have added tests that prove my fix is effective or that my feature works
  • [x] A changelog entry has been made in the appropriate crates

umgefahren avatar Sep 27 '23 21:09 umgefahren

I have implemented some of the suggestions. However I didn't touch the question on whether DialOpts should be struct or the field should just be passed as arguments.

I also moved some methods into a macro.

umgefahren avatar Sep 28 '23 09:09 umgefahren

Since we are touching PortUse anyway and we don't want to introduce breaking changes in the future: Is there a use for choosing not only if we should use an existing or new one, but also a specific one?

umgefahren avatar Sep 28 '23 09:09 umgefahren

At the moment I'm pretty pleased with the implementation. But I don't really like having the list of things that we wan't to strip. It feels like something that might cause a bug later on, but I honestly don't have a better idea.

umgefahren avatar Oct 04 '23 09:10 umgefahren

But I don't really like having the list of things that we wan't to strip. It feels like something that might cause a bug later on, but I honestly don't have a better idea.

As long as we cover it with unit tests, that is fine. We can always ship improvements as patch releases later if we find that we are missing something :)

thomaseizinger avatar Oct 05 '23 04:10 thomaseizinger

With the latest revisions, I would feel comfortable to present the changes I made at the next Open Maintainers call to gather feedback. If that's alright with @mxinden and @thomaseizinger

umgefahren avatar Oct 08 '23 15:10 umgefahren

With the latest revisions, I would feel comfortable to present the changes I made at the next Open Maintainers call to gather feedback. If that's alright with @mxinden and @thomaseizinger

Yes, we should definitely talk about it. I think the tests that are in place now are good enough to move forward. We should discuss the deprecation of dial_as_listener and how we want to model this instead! Plus, we need to use this new API everywhere and make the tests pass to ensure we are actually doing something meaningful here.

thomaseizinger avatar Oct 08 '23 23:10 thomaseizinger

Thank you for adding the tests. I have to admit, they are a bit too much magic for my taste :/

The issue is, what use does the test case have if I can't easily verify (by reviewing it) that it is actually testing the correct thing? I first have to verify that the test infrastructure works correctly. Dynamically generated test inputs make this even harder.

Do you know about https://crates.io/crates/test-case?

I'd suggest the following improvements, although they can be left for later in favor of making progress towards AutoNATv2 :)

  • Always only use one listen address
  • Always only use one dial address
  • Make two parameterized tests, one for positive, one for negative
  • Use simple parameters when the actual value doesn't matter, for example, 1.1.1.1
    #[test_case("/ip4/10.0.0.1/tcp/4000", "/ip4/1.1.1.1/tcp/8080"; "when dialing TCP on IPv4 with existing TCP listener")]
    fn should_reuse_port(listener_addr: &str, dial_addr: &str) {
		// impl
    }

    #[test_case("/ip4/10.0.0.1/tcp/4000", "/ip4/1.1.1.1/udp/8080/quic"; "when dialing QUIC on IPv4 with existing TCP listener")]
    fn should_not_reuse_port(listener_addr: &str, dial_addr: &str) {
		// impl
    }

I thought the test cases were really elegant. But the mentioned crate seems like a better way (although it will increase compile time). I will rewrite the test cases to be easier to verify.

umgefahren avatar Oct 09 '23 09:10 umgefahren

To make sure we are on the same page, I would expect the static libp2p_tcp::Config::port_reuse to go away, as we now have the dynamic configuration through DialOpts.

Me too!

thomaseizinger avatar Oct 19 '23 11:10 thomaseizinger

I have implemented your suggestions, formatted the code and did the necessary refactoring to the libp2p-tcp transport. The test cases for libp2p-swarm and libp2p-tcp should pass now. I think the first milestone of my grant is reached, if @mxinden would agree. Let me know as soon as possible so that I can issue a bill.

umgefahren avatar Oct 19 '23 13:10 umgefahren

The test cases for libp2p-swarm and libp2p-tcp should pass now.

Awesome! I'll have a look in detail today.

thomaseizinger avatar Oct 19 '23 19:10 thomaseizinger

On my machine, the CI passes now. I hope I have implemented all your suggestions.

umgefahren avatar Oct 21 '23 16:10 umgefahren

On my machine, the CI passes now. I hope I have implemented all your suggestions.

Can you run cargo clippy --all-features --all-targets with a recent nightly compiler? That should turn up the remaining errors locally :)

thomaseizinger avatar Oct 21 '23 21:10 thomaseizinger

I'm currently writing the actual AutoNATv2 implementation. Quite a big chunk of that is just request_response, but it's currently not possible to get the observed address for an incoming request. I think it would be quite easy to implement that and something we could get into v0.53.

The only downside I see is that the API may become more complex. What do you guys thing?

umgefahren avatar Oct 30 '23 15:10 umgefahren

Quite a big chunk of that is just request_response, but it's currently not possible to get the observed address for an incoming request. I think it would be quite easy to implement that and something we could get into v0.53.

How would you want to expose it?

In general, I consider libp2p-request-response an abstraction aiming to solve simple use-cases. For more complex use-cases, I suggest writing the logic custom from scratch, instead of extending libp2p-request-response and thus making it more complex.

mxinden avatar Oct 30 '23 15:10 mxinden

I have written a small prototype. I will open a PR with it and I would suggest discussing there if the added observed_addr makes it to complex.

I want to add that, at least in this case, request-response will probably cut down on complexity in that code.

This is the first network behavior I implement from scratch and it's really a lot of boilerplate to make it work.

umgefahren avatar Oct 30 '23 15:10 umgefahren

Quite a big chunk of that is just request_response, but it's currently not possible to get the observed address for an incoming request. I think it would be quite easy to implement that and something we could get into v0.53.

How would you want to expose it?

In general, I consider libp2p-request-response an abstraction aiming to solve simple use-cases. For more complex use-cases, I suggest writing the logic custom from scratch, instead of extending libp2p-request-response and thus making it more complex.

Seconding this. I don't want request-response to become a one-size-fits-all solution. The more usecases we load onto it, the more difficult it is to make changes to it.

This is the first network behavior I implement from scratch and it's really a lot of boilerplate to make it work.

It is still a fair bit of boilerplate yeah but shouldn't be too bad actually. My recommendation would be:

  • Set ConnectionHandler::Error to Void
  • Always use ReadyUpgrade
  • Use futures-bounded to drive the streams

Quite a big chunk of that is just request_response

You also MUST use the new allocate_new_port function on DialOpts when you make the dial-back otherwise you might be accidentally hole-punching and report a wrong NAT status. Currently, request-response doesn't expose this or a capability to send a message on a particular connection which is what you'd need if you compose request-response into your behaviour.

thomaseizinger avatar Oct 31 '23 01:10 thomaseizinger

@stormshield-ebzh I hope I corrected your legitimate remark regarding QUIC. Is everything else fine and ready to merge? @thomaseizinger @jxs

umgefahren avatar Jun 04 '24 14:06 umgefahren

This pull request has merge conflicts. Could you please resolve them @umgefahren? 🙏

mergify[bot] avatar Jun 04 '24 14:06 mergify[bot]

I don't quite understand why the CI passes on these problems. Would be great if someone could give me a hint. Especially because it runs on my machine ™️

umgefahren avatar Jun 04 '24 15:06 umgefahren

I don't quite understand why the CI passes on these problems. Would be great if someone could give me a hint. Especially because it runs on my machine ™️

CI always runs on a merge commit with master. If you update this branch with master, you might be able to reproduce them locally.

thomaseizinger avatar Jun 05 '24 00:06 thomaseizinger

This pull request has merge conflicts. Could you please resolve them @umgefahren? 🙏

mergify[bot] avatar Jun 05 '24 16:06 mergify[bot]

I don't quite understand why the CI passes on these problems. Would be great if someone could give me a hint. Especially because it runs on my machine ™️

CI always runs on a merge commit with master. If you update this branch with master, you might be able to reproduce them locally.

The issue persists.

Nevermind I think it was the version.

umgefahren avatar Jun 09 '24 09:06 umgefahren

clippy fails, but this seems to be an upstream issue.

umgefahren avatar Jun 11 '24 05:06 umgefahren

This pull request has merge conflicts. Could you please resolve them @umgefahren? 🙏

mergify[bot] avatar Jun 13 '24 10:06 mergify[bot]

I still don't know why dcutr is failing. I'm not able reproduce the error locally. Not even in a Linux VM 🤷

umgefahren avatar Jun 15 '24 06:06 umgefahren

I still don't know why dcutr is failing. I'm not able reproduce the error locally. Not even in a Linux VM 🤷

Try activate logging for the test so you can see what is happening in CI.

thomaseizinger avatar Jun 15 '24 12:06 thomaseizinger

This pull request has merge conflicts. Could you please resolve them @umgefahren? 🙏

mergify[bot] avatar Jul 08 '24 18:07 mergify[bot]

I'm aware that everyone is waiting for this PR.

I want to give a short update on the progress, the remaining challenges and where help is appreciated.

As you can see all checks are passing, apart from the semver one and crucially the libp2p-dcutr one.

The last remaining challenge lies in the difficulties I have resolving those issues and finding the underlaying bug(s) that make dcutr fail (non deterministically).

I have been working on this bug quite some time now and haven't been able to figure it out. This is partially due to the complexity of the test, involving components I don't fully understand and the fact that debugging this is really difficult, since it doesn't occur in debuggers nor on my native machine, a Mac but only on Linux and there more frequently in virtual machines then on native hardware.

I highly suspect that the issue is related to the "AddrNotAvailable" error one can observe once the debug level of tracing is enabled.

Interestingly this is often originating from the connect call on the libp2p-tcp transport: https://github.com/libp2p/rust-libp2p/blob/4a16b6181d36cd703066a227deff57c6ebe91f7c/transports/tcp/src/lib.rs#L486

I think, that the reason why that causes a problem is that the 4 tuple of:

  • local ip address
  • local port
  • remote ip address
  • remote port

Not unique is. At least that's a discussed reason for that error.

Somewhat unrelated is this snippet wrong too, which I basically ported: https://github.com/libp2p/rust-libp2p/blob/4a16b6181d36cd703066a227deff57c6ebe91f7c/transports/tcp/src/lib.rs#L360-L364 I think the underlying misunderstanding the original author had, is really good described in this StackOverflow answer.

I suspect that this misunderstanding and the subtle differences between Linux and proper BSD-Unix makes the error unreproducible on macOS (and probably iOS, tvOS, watchOS, visionOS, FreeBSD and NetBSD). However I might be wrong here. As this also touches on #3889

I have tried to make it proper (in lokal revisions), but this alone doesn't seem to fix it, but it's not a regression either.

The consequences of our mistake could be diverse and significant:

  • In certain versions of macOS (I read somewhere, that its pre Monterey) and in FreeBSD it's possible for a different user on the same machine to take over this port. I assume this doesn't have any direct security implications because of the existence of noise etc, however it's not good
  • We use wildcard addresses and the implementations in the different operating systems on how to handle these might be part of the problem
  • There seems to be a different flag supported in FreeBSD that enables better handling, through load balancing on FreeBSD. macOS doesn't support this, it's unclear to me why not.
  • If we properly use SO_REUSEPORT on Linux we might even have a slight performance edge.

Also I don't know why we are not tracking if there already exists this 4 tuple for a connection. Probably because we use the unspecified address locally, but that might be a pitfall.

Another possible explanation is that, since we leave it with default TTL and don't set linger options, the connection still exists in the operating system while we consider it available again, thus triggering this error.

I would love to hear some external inputs, pointers and if you can try to reproduce the error on your machine. I think we should take this opportunity to get this right and eliminate an avoidable and annoying error.

umgefahren avatar Jul 27 '24 21:07 umgefahren

Running this on my machine seems to error in about 3 out of 4 runs. I've tried to compare logs between the successful and failing tests, but I haven't observed anything remarkable. The only thing is that the 'successful' test also will actually contain a dial error with the AddrNotAvailable. It seems it's successful if it it dials before another dial call. (Dialing event is shown before the 2nd Binding dial socket to listen socket address.)

For completeness, test results with RUST_LOG=libp2p_tcp,dcutr-test, where I added a log which would include the OutgoingConnectionError with the AddrNotAvailable:

Successful test
$ RUST_BACKTRACE=full RUST_LOG=dcutr-test,libp2p_tcp cargo test --package="libp2p-dcutr" --all-features
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.14s
     Running unittests src/lib.rs (target/debug/deps/libp2p_dcutr-6380256697deab78)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/lib.rs (target/debug/deps/lib-e0e802c5a10c6a9e)

running 1 test
2024-07-29T07:53:02.206675Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:53:02.206732Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=33145
2024-07-29T07:53:02.206970Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:53:02.206995Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=36859
2024-07-29T07:53:02.207085Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:53:02.207109Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=42329
2024-07-29T07:53:02.207298Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:33145
2024-07-29T07:53:02.207321Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/33145/p2p/12D3KooWMPB9ToByj1RLTHFsCmJRdNyWfK6ZN74cwPa1YiFVpcHR
2024-07-29T07:53:02.259610Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/36859 local_address=/ip4/127.0.0.1/tcp/33145
2024-07-29T07:53:02.464967Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:33145
2024-07-29T07:53:02.465031Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/33145/p2p/12D3KooWMPB9ToByj1RLTHFsCmJRdNyWfK6ZN74cwPa1YiFVpcHR
2024-07-29T07:53:02.465319Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/42329 local_address=/ip4/127.0.0.1/tcp/33145
2024-07-29T07:53:03.007149Z  INFO dcutr-test: event: Behaviour(ClientEvent: Sent { connection_id: ConnectionId(3), peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93") })
2024-07-29T07:53:03.091147Z  INFO dcutr-test: event: Behaviour(ClientEvent: Received { connection_id: ConnectionId(3), peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"), info: Info { public_key: PublicKey { publickey: Ed25519(PublicKey(compressed): 9843dbde8d77eac176c6e123ec84c9cc57ceb4125cac1f1efa86d8d10a7de) }, protocol_version: "/client", agent_version: "rust-libp2p/0.45.0", listen_addrs: ["/memory/14272431799217384301", "/ip4/127.0.0.1/tcp/36859", "/ip4/127.0.0.1/tcp/33145/p2p/12D3KooWMPB9ToByj1RLTHFsCmJRdNyWfK6ZN74cwPa1YiFVpcHR/p2p-circuit/p2p/12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"], protocols: ["/ipfs/id/1.0.0", "/ipfs/id/push/1.0.0"], observed_addr: "/p2p/12D3KooWGdjuqHU9KmJsKjtwEZmeo6zoQGjRpMRvEZE76cYzmnaz" } })
2024-07-29T07:53:03.091298Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"), address: "/memory/14272431799217384301" }
2024-07-29T07:53:03.091341Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"), address: "/ip4/127.0.0.1/tcp/36859" }
2024-07-29T07:53:03.091380Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"), address: "/ip4/127.0.0.1/tcp/33145/p2p/12D3KooWMPB9ToByj1RLTHFsCmJRdNyWfK6ZN74cwPa1YiFVpcHR/p2p-circuit/p2p/12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93" }
2024-07-29T07:53:03.091452Z  INFO dcutr-test: event: NewExternalAddrCandidate { address: "/p2p/12D3KooWGdjuqHU9KmJsKjtwEZmeo6zoQGjRpMRvEZE76cYzmnaz" }
2024-07-29T07:53:03.173682Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:36859
2024-07-29T07:53:03.173761Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/36859/p2p/12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93
2024-07-29T07:53:03.173913Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:42329
2024-07-29T07:53:03.173893Z  INFO dcutr-test: event: Dialing { peer_id: Some(PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93")), connection_id: ConnectionId(7) }
2024-07-29T07:53:03.173963Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/42329/p2p/12D3KooWGdjuqHU9KmJsKjtwEZmeo6zoQGjRpMRvEZE76cYzmnaz
2024-07-29T07:53:03.174106Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/42329 local_address=/ip4/127.0.0.1/tcp/36859
2024-07-29T07:53:03.216092Z  INFO dcutr-test: event: ConnectionEstablished { peer_id: PeerId("12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93"), connection_id: ConnectionId(7), endpoint: Dialer { address: "/ip4/127.0.0.1/tcp/36859/p2p/12D3KooWL4k9rwgK1f89KRCKGNf4awxQgnNNvd36GxTxK2MXgn93", role_override: Dialer, port_use: Reuse }, num_established: 2, concurrent_dial_errors: Some([]), established_in: 42.021384ms }
2024-07-29T07:53:03.216225Z TRACE libp2p_tcp: Unregistering for port reuse ip=127.0.0.1 port=42329
test connect ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 1.02s

   Doc-tests libp2p_dcutr

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

and

Failing test
$ RUST_BACKTRACE=full RUST_LOG=dcutr-test,libp2p_tcp cargo test --package="libp2p-dcutr" --all-features
   Compiling libp2p-dcutr v0.12.0 (/home/src/rust-libp2p/protocols/dcutr)
    Finished `test` profile [unoptimized + debuginfo] target(s) in 1.05s
     Running unittests src/lib.rs (target/debug/deps/libp2p_dcutr-6380256697deab78)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running tests/lib.rs (target/debug/deps/lib-e0e802c5a10c6a9e)

running 1 test
2024-07-29T07:51:22.472537Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:51:22.472621Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=40851
2024-07-29T07:51:22.473054Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:51:22.473103Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=38211
2024-07-29T07:51:22.473272Z DEBUG libp2p_tcp: listening on 127.0.0.1:0
2024-07-29T07:51:22.473319Z TRACE libp2p_tcp: Registering for port reuse ip=127.0.0.1 port=38913
2024-07-29T07:51:22.473665Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:40851
2024-07-29T07:51:22.473708Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/40851/p2p/12D3KooWPTvWXaTkty8A74LuDV66t3B8ZpMmFSpt3oMaSj5DbGeB
2024-07-29T07:51:22.519609Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/38211 local_address=/ip4/127.0.0.1/tcp/40851
2024-07-29T07:51:22.724706Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:40851
2024-07-29T07:51:22.724767Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/40851/p2p/12D3KooWPTvWXaTkty8A74LuDV66t3B8ZpMmFSpt3oMaSj5DbGeB
2024-07-29T07:51:22.725101Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/38913 local_address=/ip4/127.0.0.1/tcp/40851
2024-07-29T07:51:23.264389Z  INFO dcutr-test: event: Behaviour(ClientEvent: Sent { connection_id: ConnectionId(3), peer_id: PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM") })
2024-07-29T07:51:23.347519Z  INFO dcutr-test: event: Behaviour(ClientEvent: Received { connection_id: ConnectionId(3), peer_id: PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM"), info: Info { public_key: PublicKey { publickey: Ed25519(PublicKey(compressed): 2d1d628d32b8937879b3cdb2b19c4d6d4383ce41730a3686545ea86703bf4f0) }, protocol_version: "/client", agent_version: "rust-libp2p/0.45.0", listen_addrs: ["/memory/9209890263252963938", "/ip4/127.0.0.1/tcp/38211", "/ip4/127.0.0.1/tcp/40851/p2p/12D3KooWPTvWXaTkty8A74LuDV66t3B8ZpMmFSpt3oMaSj5DbGeB/p2p-circuit/p2p/12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM"], protocols: ["/ipfs/id/1.0.0", "/ipfs/id/push/1.0.0"], observed_addr: "/p2p/12D3KooWBipzCrV6YuJu5UFB31jShEcCitWLT8b48FZR2s2dqK3s" } })
2024-07-29T07:51:23.347669Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM"), address: "/memory/9209890263252963938" }
2024-07-29T07:51:23.347713Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM"), address: "/ip4/127.0.0.1/tcp/38211" }
2024-07-29T07:51:23.347750Z  INFO dcutr-test: event: NewExternalAddrOfPeer { peer_id: PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM"), address: "/ip4/127.0.0.1/tcp/40851/p2p/12D3KooWPTvWXaTkty8A74LuDV66t3B8ZpMmFSpt3oMaSj5DbGeB/p2p-circuit/p2p/12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM" }
2024-07-29T07:51:23.347823Z  INFO dcutr-test: event: NewExternalAddrCandidate { address: "/p2p/12D3KooWBipzCrV6YuJu5UFB31jShEcCitWLT8b48FZR2s2dqK3s" }
2024-07-29T07:51:23.430901Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:38913
2024-07-29T07:51:23.430986Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/38913/p2p/12D3KooWBipzCrV6YuJu5UFB31jShEcCitWLT8b48FZR2s2dqK3s
2024-07-29T07:51:23.431124Z DEBUG libp2p_tcp: dialing address address=127.0.0.1:38211
2024-07-29T07:51:23.431179Z TRACE libp2p_tcp: Binding dial socket to listen socket address address=/ip4/127.0.0.1/tcp/38211/p2p/12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM
2024-07-29T07:51:23.431287Z  INFO dcutr-test: event: Dialing { peer_id: Some(PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM")), connection_id: ConnectionId(8) }
2024-07-29T07:51:23.431417Z DEBUG Transport::poll: libp2p_tcp: Incoming connection from remote at local remote_address=/ip4/127.0.0.1/tcp/38211 local_address=/ip4/127.0.0.1/tcp/38913
2024-07-29T07:51:23.431522Z  INFO dcutr-test: event: IncomingConnection { connection_id: ConnectionId(9), local_addr: "/ip4/127.0.0.1/tcp/38913", send_back_addr: "/ip4/127.0.0.1/tcp/38211" }
2024-07-29T07:51:23.431616Z  INFO dcutr-test: event: OutgoingConnectionError { connection_id: ConnectionId(8), peer_id: Some(PeerId("12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM")), error: Transport([("/ip4/127.0.0.1/tcp/38211/p2p/12D3KooWCrUW9m7Y4BQhCYjHkmezXjyXN1qnbmoxj9A1y3a2GKXM", Other(Custom { kind: Other, error: Left(Left(Right(Os { code: 99, kind: AddrNotAvailable, message: "Cannot assign requested address" }))) }))]) }
2024-07-29T07:51:33.438078Z TRACE libp2p_tcp: Unregistering for port reuse ip=127.0.0.1 port=38913
test connect ... FAILED

failures:

---- connect stdout ----
thread 'connect' panicked at /home/src/rust-libp2p/swarm-test/src/lib.rs:318:38:
Swarm did not emit an event within 10s
stack backtrace:
   0:     0x555556105825 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h3692694645b1bb6a
   1:     0x555556132d7b - core::fmt::write::h5131d80b4c69b88d
   2:     0x5555561025cf - std::io::Write::write_fmt::h8142063d64ec1ffc
   3:     0x5555561055fe - std::sys_common::backtrace::print::h998d75b840f75a73
   4:     0x555556107069 - std::panicking::default_hook::{{closure}}::h18ec7fe6a38b9da0
   5:     0x555556106cd6 - std::panicking::default_hook::hfb3f22c2e4075a6a
   6:     0x555555b1d9fa - test::test_main::{{closure}}::h238c9a9a662cfeed
   7:     0x5555561076db - std::panicking::rust_panic_with_hook::h51af00bcb4660c4e
   8:     0x55555610741b - std::panicking::begin_panic_handler::{{closure}}::h39f76aa863fbe8ce
   9:     0x555556105ce9 - std::sys_common::backtrace::__rust_end_short_backtrace::h4d10fc2251b89840
  10:     0x555556107187 - rust_begin_unwind
  11:     0x555556130603 - core::panicking::panic_fmt::h319840fcbcd912ef
  12:     0x5555559e59a3 - <libp2p_swarm::Swarm<B> as libp2p_swarm_test::SwarmExt>::next_swarm_event::{{closure}}::h194cc518d8c0aaa4
  13:     0x555555ae2b08 - <core::pin::Pin<P> as core::future::future::Future>::poll::h482b70b9a1a07efa
  14:     0x5555559e7131 - <libp2p_swarm::Swarm<B> as libp2p_swarm_test::SwarmExt>::wait::{{closure}}::h4fe547bc70a3c169
  15:     0x555555ae2df7 - <core::pin::Pin<P> as core::future::future::Future>::poll::hd86bd9b5474db10d
  16:     0x555555a8e5a5 - lib::connect::{{closure}}::h6d2492d96ce0d5e8
  17:     0x555555a44838 - <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll::{{closure}}::h40be3693af1ae1d6
  18:     0x55555599c86c - async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::{{closure}}::h269000f9f03f85be
  19:     0x555555a2047c - std::thread::local::LocalKey<T>::try_with::h0878c215a08025cc
  20:     0x555555a202e6 - std::thread::local::LocalKey<T>::with::h699163a3713234c0
  21:     0x55555599c7be - async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::hca6735c7b35398fe
  22:     0x555555a447fa - <async_std::task::builder::SupportTaskLocals<F> as core::future::future::Future>::poll::he4adc18c948fb0f1
  23:     0x555555906d8e - <futures_lite::future::Or<F1,F2> as core::future::future::Future>::poll::h1d4c65338d6234c1
  24:     0x555555a21947 - async_executor::Executor::run::{{closure}}::h3e899f52b2ac835c
  25:     0x555555a21567 - async_executor::LocalExecutor::run::{{closure}}::h9b662a8dccda7d7d
  26:     0x55555599fa2b - async_io::driver::block_on::h099c3006222fc3c1
  27:     0x555555a28e9e - async_global_executor::reactor::block_on::{{closure}}::h2a1526a2a02d8162
  28:     0x555555a28e1e - async_global_executor::reactor::block_on::hdeded90a90d87826
  29:     0x555555904ffc - async_global_executor::executor::block_on::{{closure}}::hbd510c998130a868
  30:     0x555555a20594 - std::thread::local::LocalKey<T>::try_with::h0e124ab0bb7e8f5b
  31:     0x555555a202b6 - std::thread::local::LocalKey<T>::with::h59b24abc49843f52
  32:     0x555555904f39 - async_global_executor::executor::block_on::hdc276b259ae0c598
  33:     0x555555a44edd - async_std::task::builder::Builder::blocking::{{closure}}::{{closure}}::h3a51054b3896d50a
  34:     0x55555599c999 - async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::{{closure}}::hecb3fa1857cf48e7
  35:     0x555555a20986 - std::thread::local::LocalKey<T>::try_with::hba7b1c66460ad013
  36:     0x555555a20376 - std::thread::local::LocalKey<T>::with::ha665f49127daeb18
  37:     0x55555599c779 - async_std::task::task_locals_wrapper::TaskLocalsWrapper::set_current::h34e0eabedb8eccb4
  38:     0x555555a44e17 - async_std::task::builder::Builder::blocking::{{closure}}::hecacb01ad05ffd34
  39:     0x555555a20794 - std::thread::local::LocalKey<T>::try_with::h72a63cff613f1e2f
  40:     0x555555a20256 - std::thread::local::LocalKey<T>::with::h2ac03beeacdc4cdd
  41:     0x555555a44b29 - async_std::task::builder::Builder::blocking::hb3a780623e67ff9a
  42:     0x55555592421b - async_std::task::block_on::block_on::hfe08d2703720910a
  43:     0x555555a05c06 - lib::connect::h7f98f7fe9592c247
  44:     0x555555a8d2d3 - lib::connect::{{closure}}::h49b3ecedd8518355
  45:     0x55555590acc6 - core::ops::function::FnOnce::call_once::haf042cffdad8421a
  46:     0x555555b2224b - test::__rust_begin_short_backtrace::hae44b86aaeda3a0f
  47:     0x555555b219a1 - test::run_test::{{closure}}::h4148d3c10c6c98f1
  48:     0x555555ae95a4 - std::sys_common::backtrace::__rust_begin_short_backtrace::h91e427ea6df4cf63
  49:     0x555555aedf32 - core::ops::function::FnOnce::call_once{{vtable.shim}}::had7513e31342a613
  50:     0x55555610bafb - std::sys::pal::unix::thread::Thread::new::thread_start::h3b8e81128811868f
  51:     0x7ffff7d58272 - start_thread
  52:     0x7ffff7dd3dec - __GI___clone3
  53:                0x0 - <unknown>


failures:
    connect

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out; finished in 10.97s

error: test failed, to rerun pass `-p libp2p-dcutr --test lib

b-zee avatar Jul 29 '24 13:07 b-zee

For completeness, test results with RUST_LOG=libp2p_tcp,dcutr-test, where I added a log which would include the OutgoingConnectionError with the AddrNotAvailable:

Trying adding libp2p_swarm_test=trace to see logs of the test harness. Maybe there is a race condition in how the events are processed.

thomaseizinger avatar Jul 30 '24 01:07 thomaseizinger