# QUIC attempt 4

## Description
Yep... This time it's written on top of quinn, not quinn-proto. The whole transport is implemented in about 600 lines of code in quic/src/lib.rs.
## Links to any relevant issues

- Attempt 3: https://github.com/libp2p/rust-libp2p/pull/2289
- Perf: https://github.com/kpp/libp2p-perf/tree/quic-attempt-4

## Open Questions
- It uses quinn#named-futures, which is outdated, so right now only the tokio runtime is supported.
- We need to deal with `AddressChange` somehow; I have no idea how to do that yet.
- There are a lot of `unwrap`s; I need to clean up the code a little bit.
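On the `unwrap`s: the usual cleanup is to propagate a `Result` instead of panicking. A minimal sketch of the idiom (the `TransportError` type and `parse_port` helper below are hypothetical illustrations, not code from this PR):

```rust
use std::fmt;

// Hypothetical error type standing in for a transport error enum;
// real code would wrap quinn's error types instead.
#[derive(Debug, PartialEq)]
enum TransportError {
    InvalidMultiaddr,
}

impl fmt::Display for TransportError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "invalid multiaddr")
    }
}

impl std::error::Error for TransportError {}

// Instead of `addr.rsplit('/').next().unwrap().parse().unwrap()`,
// propagate the failure so the caller decides how to handle it.
fn parse_port(addr: &str) -> Result<u16, TransportError> {
    addr.rsplit('/')
        .next()
        .and_then(|p| p.parse().ok())
        .ok_or(TransportError::InvalidMultiaddr)
}

fn main() {
    assert_eq!(parse_port("/ip4/127.0.0.1/udp/9992"), Ok(9992));
    assert!(parse_port("/ip4/127.0.0.1/udp/quic").is_err());
    println!("ok");
}
```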
## Benchmarks

### QUIC with libp2p-perf
```
$ ./run.sh
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".
# Rust -> Rust
Local peer id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(10.001692342s, 4396669396))
Interval Transfer Bandwidth
0 s - 10.00 s 4396 MBytes 3516.68 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWH1oPSwuvp5v5AUqvGT3HR64YZRC7VZmhVZkewiLcPDpa"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54502/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Rust -> Golang
Local peer id: PeerId("12D3KooWLy9RkY4uW26ryp3seZ6QXuiJbmBywVE359MnQopnYEj2")
Interval Transfer Bandwidth
0 s - 10.00 s 3062 MBytes 2449.60 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }
# Golang -> Rust
2022/08/05 15:10:43 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWKGLyqMFxSXgqv8n1mwX3Ljxg18ENFTtuiHTRYE8DHxrv"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval Transfer Bandwidth
0s - 10.00 s 2160 MBytes 1727.97 MBit/s
Behaviour(PerfRunDone(9.999945043s, 2160640000))
ConnectionClosed { peer_id: PeerId("12D3KooWKGLyqMFxSXgqv8n1mwX3Ljxg18ENFTtuiHTRYE8DHxrv"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/54359/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Golang -> Golang
2022/08/05 15:10:53 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval Transfer Bandwidth
0s - 10.00 s 2085 MBytes 1667.91 MBit/s
```
### TCP with libp2p-perf
```
kpp@xps:~/parity/libp2p-perf$ ./run.sh
# Start Rust and Golang servers.
# Rust -> Rust
## Transport security noise
Interval Transfer Bandwidth
0 s - 10.00 s 8201 MBytes 6560.56 MBit/s
## Transport security plaintext
Interval Transfer Bandwidth
0 s - 10.00 s 14598 MBytes 11678.23 MBit/s
# Rust -> Golang
## Transport security noise
Interval Transfer Bandwidth
0 s - 10.00 s 6832 MBytes 5465.37 MBit/s
## Transport security plaintext
Interval Transfer Bandwidth
0 s - 10.00 s 12462 MBytes 9969.54 MBit/s
# Golang -> Rust
## Transport security noise
Interval Transfer Bandwidth
0s - 10.00 s 9636 MBytes 7707.60 MBit/s
## Transport security plaintext
Interval Transfer Bandwidth
0s - 10.00 s 20705 MBytes 16563.89 MBit/s
# Golang -> Golang
## Transport security noise
Interval Transfer Bandwidth
0s - 10.00 s 5741 MBytes 4592.47 MBit/s
## Transport security plaintext
Interval Transfer Bandwidth
0s - 10.00 s 21973 MBytes 17578.36 MBit/s
```
@kpp any chance you can also post benchmarks for tcp on the same setup? would be interesting to see how things compare
Sure. Posted in the PR description.
> @kpp any chance you can also post benchmarks for tcp on the same setup? would be interesting to see how things compare
Adding to this: do you have benchmarks from the latest implementation in #2289 for comparison?
As discussed in https://github.com/libp2p/rust-libp2p/pull/2289#issuecomment-1143614039, we have thus far not switched to using quinn (instead of quinn-proto) because of their usage of unbounded channels and overhead at spawning tasks. Both will be fixed with https://github.com/quinn-rs/quinn/pull/1219, but this PR is not merged yet and has actually been stale for a couple of months now.
I am generally in favor of using quinn, but until they publish a new version that includes that change we won't be able to merge an implementation that relies on this (and this PR directly depends on https://github.com/quinn-rs/quinn/pull/1357, which is not merged yet as well).
I am undecided here. I don't want to block on https://github.com/quinn-rs/quinn/pull/1219 or https://github.com/quinn-rs/quinn/pull/1357, which is why imo it makes more sense to first focus on finishing #2289, and switch over at a later point. That being said, I definitely like how using quinn saves a lot of complexity and simplifies the integration. Maybe we could try to collaborate with the quinn maintainers to get the mentioned PRs merged and a new version published (provided that the benchmarks do not indicate any performance hit when using quinn).
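For illustration, the backpressure concern behind the unbounded-channel objection can be sketched with std's bounded channel: unlike an unbounded sender, `sync_channel`'s `send` blocks once the buffer is full, so a slow consumer paces the producer instead of letting the queue grow without limit (a toy sketch, not quinn's actual internals):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Bounded channel: `send` blocks when `capacity` items are queued,
// throttling the producer to the consumer's pace.
fn run(capacity: usize, items: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::sync_channel::<u32>(capacity);

    let producer = thread::spawn(move || {
        for i in 0..items {
            // Blocks whenever the buffer already holds `capacity` items.
            tx.send(i).unwrap();
        }
    });

    // Deliberately slow consumer; an unbounded channel would let the
    // producer race far ahead and buffer everything in memory.
    let mut received = Vec::new();
    for _ in 0..items {
        thread::sleep(Duration::from_millis(5));
        received.push(rx.recv().unwrap());
    }
    producer.join().unwrap();
    received
}

fn main() {
    assert_eq!(run(2, 5), vec![0, 1, 2, 3, 4]);
    println!("bounded channel delivered everything, in order");
}
```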
The quinn PRs cited above are mainly blocked on feedback. If they're useful to you, please post on them! I'm particularly interested in hearing which futures are useful to have nameable, and which ones should be 'static. If you have measured cases where https://github.com/quinn-rs/quinn/pull/1219 significantly improves performance/reduces resource usage, that would also be very helpful to know about.
In the meantime, I've rebased https://github.com/quinn-rs/quinn/pull/1357 to expose the latest features from main.
> If they're useful to you, please post on them! I'm particularly interested in hearing which futures are useful to have nameable
@Ralith They are! I only use `OpenBi`, which is very useful.
@Ralith I merged unify-drivers into named-futures (https://github.com/kpp/quinn/tree/named_unify_drivers) and benchmarked it.
https://github.com/kpp/rust-libp2p/?branch=quic-attempt-4#03ad4570
```
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".
# Rust -> Rust
Local peer id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(9.999420281s, 3016841873))
Interval Transfer Bandwidth
0 s - 10.00 s 3016 MBytes 2412.68 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWJv92fE9vbhoofFjqJYyGu5Qk4CSKCKeeJgARGyN3FYTu"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/42633/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Rust -> Golang
Local peer id: PeerId("12D3KooWSvBP6ZX34qUPQr25CaMoPevAPpjVHi3vU9urx1nGntKG")
Interval Transfer Bandwidth
0 s - 10.00 s 1754 MBytes 1403.03 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }
# Golang -> Rust
2022/09/13 00:47:10 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWRw1jhFWSCdgRxaojTwDdf7WDbNR5mqReU34km96tWY5p"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval Transfer Bandwidth
0s - 10.00 s 1353 MBytes 1082.32 MBit/s
Behaviour(PerfRunDone(10.000838344s, 1353472000))
ConnectionClosed { peer_id: PeerId("12D3KooWRw1jhFWSCdgRxaojTwDdf7WDbNR5mqReU34km96tWY5p"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/40407/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Golang -> Golang
2022/09/13 00:47:20 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval Transfer Bandwidth
0s - 10.00 s 1609 MBytes 1287.12 MBit/s
```
https://github.com/kpp/quinn/tree/named_unify_drivers
```
# Start Rust and Golang servers.
Local peer id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw")
about to listen on "/ip4/127.0.0.1/udp/9992/quic"
Listening on "/ip4/127.0.0.1/udp/9992/quic".
# Rust -> Rust
Local peer id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7")
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }, num_established: 1, concurrent_dial_errors: None }
Behaviour(PerfRunDone(10.003066501s, 2392703701))
Interval Transfer Bandwidth
0 s - 10.00 s 2392 MBytes 1913.57 MBit/s
ConnectionClosed { peer_id: PeerId("Qmcqq9TFaYbb94uwdER1BXyGfCFY4Bb1gKozxNyVvLvTSw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9992/quic", role_override: Dialer }, num_established: 0, cause: None }
ConnectionClosed { peer_id: PeerId("12D3KooWRKpJTMAZ7C9VUKFUV3qMgS5oYtxJxoA2e9doJWF5PNy7"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/57680/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Rust -> Golang
Local peer id: PeerId("12D3KooWMC4uryeh4ZFyoPekrCq3bd93g3oykpE2AnjYHxiKeMsy")
Interval Transfer Bandwidth
0 s - 10.00 s 1896 MBytes 1516.70 MBit/s
ConnectionClosed { peer_id: PeerId("12D3KooWL3XJ9EMCyZvmmGXL2LMiVBtrVa2BuESsJiXkSj7333Jw"), endpoint: Dialer { address: "/ip4/127.0.0.1/udp/9993/quic", role_override: Dialer }, num_established: 0, cause: None }
# Golang -> Rust
2022/09/13 00:52:31 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
IncomingConnection { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }
ConnectionEstablished { peer_id: PeerId("12D3KooWN5vYpKMmf37UMhAAJqRWTcmtsFa31t1L9zYyZSTNgu2k"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }, num_established: 1, concurrent_dial_errors: None }
Interval Transfer Bandwidth
0s - 10.00 s 1388 MBytes 1110.32 MBit/s
Behaviour(PerfRunDone(10.000623734s, 1388416000))
ConnectionClosed { peer_id: PeerId("12D3KooWN5vYpKMmf37UMhAAJqRWTcmtsFa31t1L9zYyZSTNgu2k"), endpoint: Listener { local_addr: "/ip4/127.0.0.1/udp/9992/quic", send_back_addr: "/ip4/127.0.0.1/udp/37458/quic" }, num_established: 0, cause: Some(IO(Custom { kind: Other, error: ApplicationClosed(ApplicationClose { error_code: 0, reason: b"" }) })) }
# Golang -> Golang
2022/09/13 00:52:41 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
Interval Transfer Bandwidth
0s - 10.00 s 1439 MBytes 1151.17 MBit/s
```
I see a performance degradation compared to https://github.com/kpp/rust-libp2p/?branch=quic-attempt-4#03ad4570, which is based on my own rebase of named-futures onto main.
Thanks! Details about the test setup (amount of concurrent activity, I/O pattern, number of threads) would be interesting. Since GATs are in FCP, I'm planning to move named-futures forward without 'static and get it merged, along with some of the minor cleanup from the unify-drivers branch, which should make it easier to investigate soon.
Those benchmarks are in https://github.com/kpp/rust-libp2p/tree/quic-attempt-4. The I/O pattern is: send as much data as possible from one client to a server for 10 seconds and measure the bandwidth.
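For reference, that I/O pattern boils down to something like the sketch below, here over plain std TCP on loopback with a shortened window (the real benchmark runs 10 s over libp2p transports, so only the shape of the loop is faithful):

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

// One client pushes data on a single stream for a fixed window;
// bandwidth = bytes delivered / elapsed time.
fn measure(window: Duration) -> f64 {
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();

    // Server: drain everything the client sends, counting bytes.
    let server = thread::spawn(move || {
        let (mut conn, _) = listener.accept().unwrap();
        let mut buf = [0u8; 64 * 1024];
        let mut total = 0u64;
        loop {
            match conn.read(&mut buf) {
                Ok(0) | Err(_) => break,
                Ok(n) => total += n as u64,
            }
        }
        total
    });

    // Client: send as fast as possible until the window elapses.
    let mut conn = TcpStream::connect(addr).unwrap();
    let chunk = [0u8; 64 * 1024];
    let start = Instant::now();
    while start.elapsed() < window {
        conn.write_all(&chunk).unwrap();
    }
    drop(conn); // close the stream so the server's read loop ends

    let bytes = server.join().unwrap();
    bytes as f64 * 8.0 / window.as_secs_f64() / 1e6 // MBit/s
}

fn main() {
    let mbit = measure(Duration::from_millis(200));
    println!("~{:.0} MBit/s over loopback", mbit);
    assert!(mbit > 0.0);
}
```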
> Those benchmarks are in https://github.com/kpp/rust-libp2p/tree/quic-attempt-4

It's not immediately obvious to me where I should be looking.

> IO pattern is: send as much data from 1 client to a server for 10 seconds and measure the bandwidth.

Using a single stream?
> It's not immediately obvious to me where I should be looking.

Sorry, this is the link: https://github.com/kpp/libp2p-perf/tree/quic-attempt-4. It's better to look at the golang/main.go implementation to understand what's going on. The Rust code does the same, but with 10x the boilerplate.
> Using a single stream?

Yes.
Hey, let me drop some related info that might help. After months of working on my own QUIC/TLS 1.3 implementation for Rust, I came across the https://github.com/aws/s2n-quic repository. The design looks pretty much identical to my own concept, so I believe they took the correct path and it's worth checking out. The overall interface approach is pretty much how things should be designed in Rust.
Hey, thanks. I saw their project, and it's worth trying to implement libp2p-quic over s2n-quic to benchmark the solutions.
A comparative analysis should probably also include Cloudflare's and Mozilla's implementations. Rust has a lot of QUICs these days!
Yeah, Mozilla's implementation (Neqo) looks great and the source code is probably the most optimal, but unfortunately they have no plans to release the crates (https://github.com/mozilla/neqo/issues/1299), and I think they have pretty much decided which async runtime they want to support (https://github.com/mozilla/neqo/issues/530). Maybe this isn't a big deal after all, but I just wanted to share it here.
Let me know if you need any assistance in using or benchmarking s2n-quic.
Re: boxing and cancellation-safety: now that GATs are possible, traits can include lifetime-bearing associated futures, so it's possible to abstract over futures like Accept without boxing. That said, a single malloc there isn't likely to be measurable, so you're probably fine either way.
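For illustration, here is a minimal sketch of such a GAT-based trait; all names (`Accepting`, `AcceptFuture`, `Toy`) are hypothetical, not from quinn or libp2p, and it needs a compiler with stable GATs (Rust 1.65+):

```rust
use std::future::{ready, Future, Ready};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// With a GAT, the associated future may borrow the listener for 'a, so
// `accept` can return a named, unboxed, possibly !Unpin future.
trait Accepting {
    type Conn;
    type AcceptFuture<'a>: Future<Output = Self::Conn> + 'a
    where
        Self: 'a;

    fn accept(&mut self) -> Self::AcceptFuture<'_>;
}

// Toy implementation: "connections" are just counter values, and the
// concrete future type is named directly; no Box in sight.
struct Toy(u32);

impl Accepting for Toy {
    type Conn = u32;
    type AcceptFuture<'a> = Ready<u32> where Self: 'a;

    fn accept(&mut self) -> Self::AcceptFuture<'_> {
        self.0 += 1;
        ready(self.0)
    }
}

// Minimal no-op waker so we can poll once without an executor.
fn noop_waker() -> Waker {
    const VTABLE: RawWakerVTable = RawWakerVTable::new(|_| RAW, |_| {}, |_| {}, |_| {});
    const RAW: RawWaker = RawWaker::new(std::ptr::null(), &VTABLE);
    unsafe { Waker::from_raw(RAW) }
}

fn now_or_never<F: Future>(fut: F) -> Option<F::Output> {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => Some(v),
        Poll::Pending => None,
    }
}

fn main() {
    let mut listener = Toy(0);
    assert_eq!(now_or_never(listener.accept()), Some(1));
    assert_eq!(now_or_never(listener.accept()), Some(2));
    println!("accepted without boxing the trait's future");
}
```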
> Re: boxing and cancellation-safety: now that GATs are possible, traits can include lifetime-bearing associated futures, so it's possible to abstract over futures like `Accept` without boxing. That said, a single malloc there isn't likely to be measurable, so you're probably fine either way.
I think the issue here is rather that we would need self-referential structs, because we would store the future in the same type as `Endpoint`. I don't think GATs can help us there?
The associated types can be `!Unpin`! `pin-project-lite` makes working with this type of thing very easy, as demonstrated by quinn's own wrapping of `Notified`.
> The associated types can be `!Unpin`! `pin-project-lite` makes working with this type of thing very easy, as demonstrated by quinn's own wrapping of `Notified`.
Unless I am missing something, I don't think that would work in our case? `Listener` can't have a lifetime because it owns `Endpoint`, yet we would have to store `Accept` in the same struct to poll for the next connection. Ideally, we would have an `Endpoint::poll_accept` function that we can call from the `Stream` implementation.
You could refactor the traits that must be implemented to propagate `!Unpin` futures upwards rather than requiring implementations to erase them behind a poll-style API. The extra indirection can't be avoided if you want to preserve exactly the same API, but it's worth considering an API that constrains implementations less, though probably as future work.
> You could refactor the traits that must be implemented to propagate `!Unpin` futures upwards rather than requiring implementations to erase them behind a poll-style API. The extra indirection can't be avoided if you want to preserve exactly the same API, but it's worth considering an API that constrains implementations less, though probably as future work.
Thanks! Most of our APIs are designed around poll interfaces for various reasons so that is unlikely to change anytime soon.
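To make the trade-off concrete, a poll-style API typically erases the in-flight future behind `Pin<Box<...>>`, which is exactly the indirection discussed above. A hedged sketch (all names hypothetical; the accept future holds an `Arc` to shared state rather than borrowing the listener, sidestepping the self-referential problem):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Shared endpoint state; the accept future owns its own Arc handle.
struct Shared {
    next_id: u64,
}

// Stand-in accept future; a real one would wait on the endpoint.
struct AcceptFut {
    shared: Arc<Mutex<Shared>>,
}

impl Future for AcceptFut {
    type Output = u64;
    fn poll(self: Pin<&mut Self>, _cx: &mut Context) -> Poll<u64> {
        let mut s = self.shared.lock().unwrap();
        s.next_id += 1;
        Poll::Ready(s.next_id)
    }
}

// Poll-style listener: it keeps the in-flight accept future erased
// behind Pin<Box<..>>. The Box is the extra indirection, but the
// listener itself stays Unpin and can expose `poll_accept` to a
// Stream implementation.
struct Listener {
    shared: Arc<Mutex<Shared>>,
    accept: Option<Pin<Box<dyn Future<Output = u64>>>>,
}

impl Listener {
    fn new() -> Self {
        Listener {
            shared: Arc::new(Mutex::new(Shared { next_id: 0 })),
            accept: None,
        }
    }

    fn poll_accept(&mut self, cx: &mut Context) -> Poll<u64> {
        if self.accept.is_none() {
            let fut = AcceptFut { shared: Arc::clone(&self.shared) };
            self.accept = Some(Box::pin(fut));
        }
        match self.accept.as_mut().unwrap().as_mut().poll(cx) {
            Poll::Ready(conn) => {
                self.accept = None; // arm a fresh accept on the next call
                Poll::Ready(conn)
            }
            Poll::Pending => Poll::Pending,
        }
    }
}

// Minimal no-op waker so we can poll without an executor.
fn noop_waker() -> Waker {
    const VTABLE: RawWakerVTable = RawWakerVTable::new(|_| RAW, |_| {}, |_| {}, |_| {});
    const RAW: RawWaker = RawWaker::new(std::ptr::null(), &VTABLE);
    unsafe { Waker::from_raw(RAW) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let mut listener = Listener::new();
    assert!(matches!(listener.poll_accept(&mut cx), Poll::Ready(1)));
    assert!(matches!(listener.poll_accept(&mut cx), Poll::Ready(2)));
    println!("poll_accept delivered two connections");
}
```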