rust-libp2p icon indicating copy to clipboard operation
rust-libp2p copied to clipboard

Quic attempt 4

Open kpp opened this issue 3 years ago • 10 comments

Description

Yep... This time it's written over quinn, not quinn-proto. Feature: implemented with 600 lines of code in quic/src/lib.rs.

Links to any relevant issues

Attempt N3: https://github.com/libp2p/rust-libp2p/pull/2289

Perf: https://github.com/kpp/libp2p-perf/tree/quic-attempt-4

Open Questions

  1. It uses quin#named-futures, which is outdated. So right now only tokio runtime is supported.
  2. We need to deal with AddressChange somehow. I have no idea how to do that.
  3. There are a lot of unwraps. I need to clean up the code a little bit.

Benchmarks:

QUIC with libp2p-perf
$ ./run.sh 
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".

# Rust -> Rust

Local peer id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(10.001692342s, 4396669396))
Interval	Transfer	Bandwidth
0 s - 10.00 s	4396 MBytes	3516.68 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Rust -> Golang

Local peer id: PeerId("12D3KooWLy9RkY4uW26ryp3seZ6QXuiJbmBywVE359MnQopnYEj2")
Interval	Transfer	Bandwidth
0 s - 10.00 s	3062 MBytes	2449.60 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }


# Golang -> Rust

2022/08/05 15:10:43 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWKGLyqMFxSXgqv8n1mwX3Ljxg18ENFTtuiHTRYE8DHxrv"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval 	Transfer	Bandwidth
0s - 10.00 s 	2160 MBytes	1727.97 MBit/s
Behaviour(PerfRunDone(9.999945043s, 2160640000))
ConnectionClosed { peer_id: PeerId("12D3KooWKGLyqMFxSXgqv8n1mwX3Ljxg18ENFTtuiHTRYE8DHxrv"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Golang -> Golang

2022/08/05 15:10:53 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval 	Transfer	Bandwidth
0s - 10.00 s 	2085 MBytes	1667.91 MBit/s
TCP with libp2p-perf
   Compiling libp2p-core v0.32.1
   Compiling libp2p-noise v0.35.0
   Compiling libp2p-plaintext v0.32.0
   Compiling netlink-proto v0.10.0
   Compiling rtnetlink v0.10.1
   Compiling if-watch v1.1.1
   Compiling ed25519-dalek v1.0.1
   Compiling toml v0.5.9
   Compiling proc-macro-crate v1.1.3
   Compiling multihash-derive v0.8.0
   Compiling multihash v0.16.2
   Compiling multiaddr v0.14.0
   Compiling libp2p-swarm v0.35.0
   Compiling libp2p-dns v0.32.1
   Compiling libp2p-tcp v0.32.0
   Compiling libp2p-yamux v0.36.0
   Compiling libp2p v0.44.0
   Compiling libp2p-perf v0.1.0 (/home/kpp/parity/libp2p-perf/rust)
    Finished release [optimized + debuginfo] target(s) in 37.42s
kpp@xps:~/parity/libp2p-perf$ ./run.sh 
# Start Rust and Golang servers.

# Rust -> Rust

## Transport security noise
Interval	Transfer	Bandwidth
0 s - 10.00 s	8201 MBytes	6560.56 MBit/s

## Transport security plaintext
Interval	Transfer	Bandwidth
0 s - 10.00 s	14598 MBytes	11678.23 MBit/s

# Rust -> Golang

## Transport security noise
Interval	Transfer	Bandwidth
0 s - 10.00 s	6832 MBytes	5465.37 MBit/s

## Transport security plaintext
Interval	Transfer	Bandwidth
0 s - 10.00 s	12462 MBytes	9969.54 MBit/s

# Golang -> Rust

## Transport security noise
Interval 	Transfer	Bandwidth
0s - 10.00 s 	9636 MBytes	7707.60 MBit/s

## Transport security plaintext
Interval 	Transfer	Bandwidth
0s - 10.00 s 	20705 MBytes	16563.89 MBit/s

# Golang -> Golang

## Transport security noise
Interval 	Transfer	Bandwidth
0s - 10.00 s 	5741 MBytes	4592.47 MBit/s

## Transport security plaintext
Interval 	Transfer	Bandwidth
0s - 10.00 s 	21973 MBytes	17578.36 MBit/s

kpp avatar Aug 05 '22 12:08 kpp

@kpp any chance you can also post benchmarks for tcp on the same setup? would be interesting to see how things compare

dignifiedquire avatar Aug 05 '22 12:08 dignifiedquire

Sure. Posted in the PR description.

kpp avatar Aug 05 '22 12:08 kpp

@kpp any chance you can also post benchmarks for tcp on the same setup? would be interesting to see how things compare

Adding to this: do you have benchmarks from the latest implementation in #2289 for comparison?


As discussed in https://github.com/libp2p/rust-libp2p/pull/2289#issuecomment-1143614039, we have thus far not switched to using quinn (instead of quinn-proto) because of their usage of unbounded channels and overhead at spawning tasks. Both will be fixed with https://github.com/quinn-rs/quinn/pull/1219, but this PR is not merged yet and has actually been stale for a couple of months now. I am generally in favor of using quinn, but until they publish a new version that includes that change we won't be able to merge an implementation that relies on this (and this PR directly depends on https://github.com/quinn-rs/quinn/pull/1357, which is not merged yet as well). I am undecided here. I don't want to block on https://github.com/quinn-rs/quinn/pull/1219 or https://github.com/quinn-rs/quinn/pull/1357, which is why imo it makes more sense to first focus on finishing #2289, and switch over at a later point. That being said, I definitely like how using quinn saves a lot of complexity and simplifies the integration. Maybe we could try to collaborate with the quinn maintainers to get the mentioned PRs merged and a new version published (provided that the benchmarks do not indicate any performance hit when using quinn).

elenaf9 avatar Aug 05 '22 13:08 elenaf9

The quinn PRs cited above are mainly blocked on feedback. If they're useful to you, please post on them! I'm particularly interested in hearing which futures are useful to have nameable, and which ones should be 'static. If you have measured cases where https://github.com/quinn-rs/quinn/pull/1219 significantly improves performance/reduces resource usage, that would also be very helpful to know about.

In the meantime, I've rebased https://github.com/quinn-rs/quinn/pull/1357 to expose the latest features from main.

Ralith avatar Aug 28 '22 02:08 Ralith

If they're useful to you, please post on them! I'm particularly interested in hearing which futures are useful to have nameable

@Ralith They are! I use OpenBi only which is very very useful.

kpp avatar Sep 12 '22 14:09 kpp

@Ralith I merged unify-drivers into named-futures (https://github.com/kpp/quinn/tree/named_unify_drivers) and benchmarked it.

https://github.com/kpp/rust-libp2p/?branch=quic-attempt-4#03ad4570
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".

# Rust -> Rust

Local peer id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(9.999420281s, 3016841873))
Interval	Transfer	Bandwidth
0 s - 10.00 s	3016 MBytes	2412.68 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Rust -> Golang

Local peer id: PeerId("12D3KooWSvBP6ZX34qUPQr25CaMoPevAPpjVHi3vU9urx1nGntKG")
Interval	Transfer	Bandwidth
0 s - 10.00 s	1754 MBytes	1403.03 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }


# Golang -> Rust

2022/09/13 00:47:10 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWRw1jhFWSCdgRxaojTwDdf7WDbNR5mqReU34km96tWY5p"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval 	Transfer	Bandwidth
0s - 10.00 s 	1353 MBytes	1082.32 MBit/s
Behaviour(PerfRunDone(10.000838344s, 1353472000))
ConnectionClosed { peer_id: PeerId("12D3KooWRw1jhFWSCdgRxaojTwDdf7WDbNR5mqReU34km96tWY5p"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Golang -> Golang

2022/09/13 00:47:20 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval 	Transfer	Bandwidth
0s - 10.00 s 	1609 MBytes	1287.12 MBit/s
https://github.com/kpp/quinn/tree/named_unify_drivers
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".

# Rust -> Rust

Local peer id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(10.003066501s, 2392703701))
Interval	Transfer	Bandwidth
0 s - 10.00 s	2392 MBytes	1913.57 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Rust -> Golang

Local peer id: PeerId("12D3KooWMC4uryeh4ZFyoPekrCq3bd93g3oykpE2AnjYHxiKeMsy")
Interval	Transfer	Bandwidth
0 s - 10.00 s	1896 MBytes	1516.70 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }


# Golang -> Rust

2022/09/13 00:52:31 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWN5vYpKMmf37UMhAAJqRWTcmtsFa31t1L9zYyZSTNgu2k"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval 	Transfer	Bandwidth
0s - 10.00 s 	1388 MBytes	1110.32 MBit/s
Behaviour(PerfRunDone(10.000623734s, 1388416000))
ConnectionClosed { peer_id: PeerId("12D3KooWN5vYpKMmf37UMhAAJqRWTcmtsFa31t1L9zYyZSTNgu2k"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }

# Golang -> Golang

2022/09/13 00:52:41 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval 	Transfer	Bandwidth
0s - 10.00 s 	1439 MBytes	1151.17 MBit/s

I see a performance degradation against https://github.com/kpp/rust-libp2p/?branch=quic-attempt-4#03ad4570 which is based on my own rebased named-futures onto main.

kpp avatar Sep 12 '22 22:09 kpp

Thanks! Details about the test setup (amount of concurrent activity, I/O pattern, number of threads) would be interesting. Since GATs are in FCP, I'm planning to moved named-futures forwards without 'static and get it merged, along with some of the minor cleanup from the unify-drivers branch, which should make it easier to investigate soon.

Ralith avatar Sep 12 '22 22:09 Ralith

Those benchmarks are in https://github.com/kpp/rust-libp2p/tree/quic-attempt-4, IO pattern is: send as much data from 1 client to a server for 10 seconds and measure the bandwidth.

kpp avatar Sep 12 '22 23:09 kpp

Those benchmarks are in https://github.com/kpp/rust-libp2p/tree/quic-attempt-4

It's not immediately obvious to me where I should be looking.

IO pattern is: send as much data from 1 client to a server for 10 seconds and measure the bandwidth.

Using a single stream?

Ralith avatar Sep 13 '22 00:09 Ralith

It's not immediately obvious to me where I should be looking.

Sorry, this is the link: https://github.com/kpp/libp2p-perf/tree/quic-attempt-4. It's better look at golang/main.go implementation to get an understanding what's going on. The Rust code does the same but with x10 boilerplate.

Using a single stream?

Yes

kpp avatar Sep 15 '22 09:09 kpp

Hey, let me drop some related info that might help. So, after months working on my own implementation of QUIC/TLS13 for Rust I came across this https://github.com/aws/s2n-quic repository. The design looks pretty much identical to my own concept thus I believe they took the correct path and it's worth checking. The overall interface approache is pretty much how things should be designed in Rust.

xpepermint avatar Sep 27 '22 12:09 xpepermint

Hey, thanks. I saw their project and it worth trying to implement libp2p-quic over s2n-quic to benchmark the solutions.

kpp avatar Sep 27 '22 12:09 kpp

A comparative analysis should probably also include Cloudflare's and Mozilla's implementations. Rust has a lot of QUICs these days!

Ralith avatar Sep 27 '22 17:09 Ralith

Yeah, Mozilla's implementation (Neqo) looks great and the source code is probably the most optimal, but unfortunately they don't have any plan of releasing the crates https://github.com/mozilla/neqo/issues/1299 and I think they pretty much decided what async runtime they want to support https://github.com/mozilla/neqo/issues/530. Maybe this it's a big deal after all, but I just share it here.

xpepermint avatar Sep 27 '22 20:09 xpepermint

Let me know if you need any assistance in using or benchmarking s2n-quic.

camshaft avatar Oct 15 '22 19:10 camshaft

Re: boxing and cancellation-safety: now that GATs are possible, traits can include lifetime-bearing associated futures, so it's possible to abstract over futures like Accept without boxing. That said, a single malloc there isn't likely to be measurable, so you're probably fine either way.

Ralith avatar Nov 04 '22 04:11 Ralith

Re: boxing and cancellation-safety: now that GATs are possible, traits can include lifetime-bearing associated futures, so it's possible to abstract over futures like Accept without boxing. That said, a single malloc there isn't likely to be measurable, so you're probably fine either way.

I think the issue here is more that we would need self-referential structs because we would store the future in the same type as Endpoint. I don't think GATs can help us there?

thomaseizinger avatar Nov 04 '22 13:11 thomaseizinger

The associated types can be !Unpin! pin-project-lite makes working with this type of thing very easy, as demonstrated by quinn's own wrapping of Notified.

Ralith avatar Nov 04 '22 17:11 Ralith

The associated types can be Unpin! pin-project-lite makes working with this type of thing very easy, as demonstrated by quinn's own wrapping of Notified.

Unless I am missing something, I don't think that would work in our case? Listener can't have a lifetime because it owns Endpoint, yet we would have to store Accept in the same struct to poll for the next connection. Ideally, we would have an Endpoint::poll_accept function that we can call from the Stream implementation.

thomaseizinger avatar Nov 07 '22 23:11 thomaseizinger

You could refactor the traits that must be implemented to propagate !Unpin futures upwards rather than requiring implementations to erase them behind a poll-style API. The extra indirection can't be avoided if you want to preserve exactly the same API, but it's worth considering an API that constrains implementations less, though probably as future work.

Ralith avatar Nov 07 '22 23:11 Ralith

You could refactor the traits that must be implemented to propagate !Unpin futures upwards rather than requiring implementations to erase them behind a poll-style API. The extra indirection can't be avoided if you want to preserve exactly the same API, but it's worth considering an API that constrains implementations less, though probably as future work.

Thanks! Most of our APIs are designed around poll interfaces for various reasons so that is unlikely to change anytime soon.

thomaseizinger avatar Nov 08 '22 02:11 thomaseizinger