feat: Basic GSO support
This simply collects batches of same-size, same-marked datagrams to the same destination together by copying. In essence, we trade more memory copies for fewer system calls. Let's see if this matters at all.
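To make the batching rule concrete, here is a minimal sketch of the idea, not the code in this PR. The `Datagram` and `Batch` types and the `batch_datagrams` function are hypothetical names invented for illustration, and `tos` stands in for whatever marking (DSCP/ECN) the datagram carries. Only consecutive datagrams are merged, so send order is preserved, and each merge costs one extra copy of the payload.

```rust
use std::net::SocketAddr;

/// Hypothetical datagram type for this sketch (not neqo's own type).
#[derive(Debug, Clone, PartialEq)]
struct Datagram {
    dst: SocketAddr,
    tos: u8, // DSCP/ECN marking
    payload: Vec<u8>,
}

/// A run of equally sized segments for one destination, stored back to back
/// so that a single send call could hand them to the kernel together.
#[derive(Debug, PartialEq)]
struct Batch {
    dst: SocketAddr,
    tos: u8,
    segment_size: usize,
    segments: usize,
    data: Vec<u8>,
}

/// Groups consecutive datagrams that share destination, marking, and size.
/// `max_segments` caps how many segments one batch may hold.
fn batch_datagrams(datagrams: &[Datagram], max_segments: usize) -> Vec<Batch> {
    let mut batches: Vec<Batch> = Vec::new();
    for d in datagrams {
        if let Some(last) = batches.last_mut() {
            let compatible = last.dst == d.dst
                && last.tos == d.tos
                && last.segment_size == d.payload.len()
                && last.segments < max_segments;
            if compatible {
                // This copy is the price paid for saving a system call.
                last.data.extend_from_slice(&d.payload);
                last.segments += 1;
                continue;
            }
        }
        // Anything that does not match the previous batch starts a new one.
        batches.push(Batch {
            dst: d.dst,
            tos: d.tos,
            segment_size: d.payload.len(),
            segments: 1,
            data: d.payload.clone(),
        });
    }
    batches
}
```

For example, two 1200-byte datagrams to one address followed by a 500-byte datagram to the same address would come out as two batches: one with two segments and one with a single segment.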
All QNS tests are failing. I see this in the logs:
server | 1.021 INFO `libc::sendmsg` failed with Input/output error (os error 5); halting segmentation offload
server | Error: IoError(Os { code: 5, kind: Uncategorized, message: "Input/output error" })
Benchmark results
Performance differences relative to 06c007eefd013aacaf2fecc189370982f41bc278.
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: :green_heart: Performance has improved.
time: [278.77 ms 282.23 ms 285.68 ms]
thrpt: [350.04 MiB/s 354.33 MiB/s 358.71 MiB/s]
change:
time: [-69.967% -68.844% -67.628%] (p = 0.00 < 0.05)
thrpt: [… +220.96% +232.97%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: :broken_heart: Performance has regressed.
time: [388.79 ms 393.18 ms 398.26 ms]
thrpt: [25.110 Kelem/s 25.434 Kelem/s 25.721 Kelem/s]
change:
time: [+15.492% +16.873% +18.319%] (p = 0.00 < 0.05)
thrpt: [… -14.437% -13.414%]
Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: :broken_heart: Performance has regressed.
time: [26.470 ms 26.595 ms 26.736 ms]
thrpt: [37.403 elem/s 37.601 elem/s 37.779 elem/s]
change:
time: [+2.5803% +3.3466% +4.1706%] (p = 0.00 < 0.05)
thrpt: [… -3.2382% -2.5154%]
Found 6 outliers among 100 measurements (6.00%)
4 (4.00%) high mild
2 (2.00%) high severe
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: :green_heart: Performance has improved.
time: [1.1900 s 1.2022 s 1.2144 s]
thrpt: [82.345 MiB/s 83.180 MiB/s 84.033 MiB/s]
change:
time: [-34.210% -33.125% -32.145%] (p = 0.00 < 0.05)
thrpt: [… +49.532% +51.999%]
decode 4096 bytes, mask ff: No change in performance detected.
time: [12.072 µs 12.105 µs 12.145 µs]
change: [-0.9610% -0.1559% +0.4897%] (p = 0.72 > 0.05)
Found 19 outliers among 100 measurements (19.00%)
3 (3.00%) low severe
5 (5.00%) low mild
2 (2.00%) high mild
9 (9.00%) high severe
decode 1048576 bytes, mask ff: No change in performance detected.
time: [3.1327 ms 3.1428 ms 3.1543 ms]
change: [-0.5210% -0.0295% +0.5089%] (p = 0.91 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low mild
10 (10.00%) high severe
decode 4096 bytes, mask 7f: No change in performance detected.
time: [20.152 µs 20.203 µs 20.260 µs]
change: [-0.8010% -0.3795% +0.0250%] (p = 0.08 > 0.05)
Found 21 outliers among 100 measurements (21.00%)
4 (4.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
12 (12.00%) high severe
decode 1048576 bytes, mask 7f: No change in performance detected.
time: [5.2468 ms 5.2583 ms 5.2714 ms]
change: [-0.4201% -0.0685% +0.2965%] (p = 0.72 > 0.05)
Found 14 outliers among 100 measurements (14.00%)
14 (14.00%) high severe
decode 4096 bytes, mask 3f: No change in performance detected.
time: [7.0353 µs 7.0775 µs 7.1224 µs]
change: [-1.8397% -0.2449% +1.1385%] (p = 0.78 > 0.05)
Found 18 outliers among 100 measurements (18.00%)
3 (3.00%) low severe
2 (2.00%) low mild
13 (13.00%) high severe
decode 1048576 bytes, mask 3f: No change in performance detected.
time: [1.7921 ms 1.7978 ms 1.8048 ms]
change: [-0.4674% +0.0096% +0.5485%] (p = 0.92 > 0.05)
Found 6 outliers among 100 measurements (6.00%)
6 (6.00%) high severe
1000 streams of 1 bytes/multistream: :broken_heart: Performance has regressed.
time: [24.129 ms 24.154 ms 24.180 ms]
change: [+2.0800% +2.2246% +2.3872%] (p = 0.00 < 0.05)
Found 69 outliers among 500 measurements (13.80%)
65 (13.00%) high mild
4 (0.80%) high severe
1000 streams of 1000 bytes/multistream: Change within noise threshold.
time: [140.51 ms 140.55 ms 140.58 ms]
change: [+0.0462% +0.0825% +0.1196%] (p = 0.00 < 0.05)
Found 13 outliers among 500 measurements (2.60%)
13 (2.60%) high mild
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
time: [94.665 ns 94.974 ns 95.290 ns]
change: [-0.6382% -0.1315% +0.4650%] (p = 0.64 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
9 (9.00%) high mild
2 (2.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
time: [112.70 ns 113.01 ns 113.34 ns]
change: [-0.0477% +0.3468% +0.7348%] (p = 0.09 > 0.05)
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
6 (6.00%) high mild
8 (8.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
time: [112.13 ns 112.65 ns 113.24 ns]
change: [-0.3830% +0.1121% +0.6339%] (p = 0.67 > 0.05)
Found 17 outliers among 100 measurements (17.00%)
4 (4.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
8 (8.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
time: [93.091 ns 95.261 ns 99.710 ns]
change: [-1.0510% +4.7836% +15.533%] (p = 0.51 > 0.05)
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) high mild
4 (4.00%) high severe
RxStreamOrderer::inbound_frame(): Change within noise threshold.
time: [117.44 ms 117.50 ms 117.56 ms]
change: [+0.5315% +0.5961% +0.6604%] (p = 0.00 < 0.05)
Found 16 outliers among 100 measurements (16.00%)
1 (1.00%) low severe
6 (6.00%) low mild
5 (5.00%) high mild
4 (4.00%) high severe
SentPackets::take_ranges: No change in performance detected.
time: [8.2525 µs 8.4959 µs 8.7178 µs]
change: [-4.0732% -0.5984% +3.1432%] (p = 0.75 > 0.05)
Found 20 outliers among 100 measurements (20.00%)
9 (9.00%) low severe
9 (9.00%) low mild
2 (2.00%) high mild
transfer/pacing-false/varying-seeds: Change within noise threshold.
time: [35.206 ms 35.274 ms 35.342 ms]
change: [-2.4515% -2.1790% -1.9193%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
transfer/pacing-true/varying-seeds: Change within noise threshold.
time: [36.298 ms 36.401 ms 36.504 ms]
change: [-2.4249% -2.0121% -1.6339%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold.
time: [35.134 ms 35.181 ms 35.229 ms]
change: [-2.2420% -2.0430% -1.8495%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-true/same-seed: Change within noise threshold.
time: [36.650 ms 36.714 ms 36.778 ms]
change: [-2.9347% -2.7334% -2.4915%] (p = 0.00 < 0.05)
Client/server transfer results
Performance differences relative to 06c007eefd013aacaf2fecc189370982f41bc278.
Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.
| Client | Server | CC | Pacing | Mean ± σ | Min | Max | MiB/s ± σ | Δ main | Δ main |
|---|---|---|---|---|---|---|---|---|---|
| neqo | neqo | reno | on | 225.5 ± 71.0 | 175.3 | 412.3 | 141.9 ± 0.5 | :green_heart: -132.3 | -37.0% |
| neqo | neqo | reno | | 283.1 ± 223.1 | 176.8 | 1163.2 | 113.0 ± 0.1 | :green_heart: -104.5 | -27.0% |
| neqo | neqo | cubic | on | 211.7 ± 51.1 | 178.9 | 400.0 | 151.2 ± 0.6 | :green_heart: -131.9 | -38.4% |
| neqo | neqo | cubic | | 210.4 ± 56.5 | 175.7 | 449.6 | 152.1 ± 0.6 | :green_heart: -134.1 | -38.9% |
| neqo | | reno | on | 726.8 ± 120.8 | 441.1 | 988.5 | 44.0 ± 0.3 | -55.9 | -7.1% |
| neqo | | reno | | 713.5 ± 120.5 | 448.1 | 959.5 | 44.8 ± 0.3 | -51.9 | -6.8% |
| neqo | | cubic | on | 717.0 ± 115.3 | 487.2 | 970.1 | 44.6 ± 0.3 | -46.5 | -6.1% |
| neqo | | cubic | | 710.6 ± 108.4 | 469.5 | 934.9 | 45.0 ± 0.3 | :green_heart: -55.0 | -7.2% |
| | | | | 591.7 ± 69.3 | 547.2 | 864.4 | 54.1 ± 0.5 | 19.4 | 3.4% |
| neqo | msquic | reno | on | 271.7 ± 43.8 | 241.8 | 445.1 | 117.8 ± 0.7 | -3.9 | -1.4% |
| neqo | msquic | reno | | 272.8 ± 44.0 | 239.7 | 438.0 | 117.3 ± 0.7 | 5.7 | 2.1% |
| neqo | msquic | cubic | on | 265.5 ± 36.0 | 240.0 | 446.4 | 120.5 ± 0.9 | -3.7 | -1.4% |
| neqo | msquic | cubic | | 272.1 ± 45.3 | 243.0 | 474.1 | 117.6 ± 0.7 | 0.2 | 0.1% |
| msquic | msquic | | | 186.1 ± 26.4 | 158.9 | 291.3 | 172.0 ± 1.2 | -10.1 | -5.2% |
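As a reading aid, the MiB/s column appears to be consistent with the transfer size divided by the mean time. The small sketch below checks that relationship. The function name `mib_per_second` is made up for this example, and the assumption that the bot computes the column this way is mine, not something the output states.

```rust
/// Throughput in MiB/s for `bytes` transferred in `mean_ms` milliseconds.
/// Hypothetical helper; assumes the table's MiB/s column is size / mean time.
fn mib_per_second(bytes: u64, mean_ms: f64) -> f64 {
    const MIB: f64 = 1024.0 * 1024.0;
    (bytes as f64 / MIB) / (mean_ms / 1000.0)
}
```

With the 33554432 bytes (32 MiB) from the description and the 225.5 ms mean of the first row, this gives about 141.9 MiB/s, which matches the value reported in that row.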
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to 06c007eefd013aacaf2fecc189370982f41bc278.
neqo-latest as client
- neqo-latest vs. aioquic: :rocket:~~S R~~ :warning:B U :rocket:~~A~~ L1 L2 :rocket:~~C1 BP~~ :warning:C2 6 V2
- neqo-latest vs. go-x-net: :rocket:~~DC M C2 6~~ :warning:H U BP BA
- neqo-latest vs. haproxy: :rocket:~~C20~~ :warning:M S R Z :rocket:~~L1~~ :warning:3 C2 6 V2 BP BA
- neqo-latest vs. kwik: :warning:DC C20 M S R :warning:Z 3 U :warning:A L1 :rocket:~~L2~~ C1 :rocket:~~C2~~ :warning:6 BP BA
- neqo-latest vs. linuxquic: :rocket:~~M~~ R :rocket:~~Z U E~~ L1 C1 :rocket:~~6~~ V2 :rocket:~~BP~~
- neqo-latest vs. lsquic: :rocket:~~C20 Z~~ :warning:DC LR 3 :warning:B L1 C1 C2 :warning:6 BP
- neqo-latest vs. msquic: :warning:H DC M S R :rocket:~~Z~~ A L1 :rocket:~~L2~~ C1 :warning:6 BP BA
- neqo-latest vs. mvfst: :warning:LR M R :warning:Z 3 A :rocket:~~L1 C2 6~~ :warning:C1 BP :rocket:~~BA~~
- neqo-latest vs. neqo: LR :warning:C20 M :warning:S Z 3 U A :warning:C1 C2 6 V2 :rocket:~~BP~~
- neqo-latest vs. neqo-latest: :rocket:~~C20 Z 3 U~~ :warning:DC R E A :rocket:~~C2 6 CM~~ :warning:BP
- neqo-latest vs. nginx: :rocket:~~R U~~ :warning:H DC C20 M 6 BP BA
- neqo-latest vs. ngtcp2: :warning:LR C20 :warning:M B :warning:U A L2 :rocket:~~C1 BP~~ :warning:6 CM
- neqo-latest vs. picoquic: :rocket:~~LR B~~ :warning:H C20 Z A L1 :warning:C1 6 :rocket:~~V2~~
- neqo-latest vs. quic-go: :rocket:~~M Z~~ A :rocket:~~L2 BP~~ :warning:L1 C2
- neqo-latest vs. quiche: :rocket:~~H~~ C20 M :warning:S R Z 3 :rocket:~~L1 L2 C2~~ :warning:A BP BA
- neqo-latest vs. quinn: :rocket:~~H DC~~ :warning:Z U :rocket:~~E~~ A :rocket:~~6 BA~~
- neqo-latest vs. s2n-quic: :rocket:~~M~~ :warning:H LR C20 3 B U :warning:A C1 :warning:6 BA CM
- neqo-latest vs. tquic: :rocket:~~H~~ DC :rocket:~~M~~ :warning:C20 S :rocket:~~Z B L1~~ :warning:A C2 BP BA
- neqo-latest vs. xquic: :rocket:~~H~~ :warning:LR 3 B A L1 C1 :rocket:~~C2 BA~~
neqo-latest as server
- aioquic vs. neqo-latest: :warning:M S R Z 3 :rocket:~~6 V2~~ :warning:BA CM
- chrome vs. neqo-latest: 3
- go-x-net vs. neqo-latest: :rocket:~~H~~ DC :warning:LR C2 CM
- kwik vs. neqo-latest: :warning:LR Z 3 B C2 :rocket:~~BA~~ :warning:V2 BP CM
- linuxquic vs. neqo-latest: :warning:R Z 3 BP BA
- lsquic vs. neqo-latest: :rocket:~~H S~~ :warning:3 C1 V2
- msquic vs. neqo-latest: :rocket:~~H~~ DC :rocket:~~C20 B 6 V2 CM~~ :warning:L2 C2
- mvfst vs. neqo-latest: :warning:DC Z :rocket:~~3~~ :warning:B A L1 L2 C1 C2 :warning:6
- neqo vs. neqo-latest: :warning:DC 3 B :rocket:~~E~~ A L2 C1 BP :warning:CM
- ngtcp2 vs. neqo-latest: H :rocket:~~C20 S R Z 3 B~~ :warning:LR C2 6 V2
- openssl vs. neqo-latest: :rocket:~~DC~~ LR M S :rocket:~~R 3~~ A :rocket:~~BP~~ :warning:C2 CM
- picoquic vs. neqo-latest: run cancelled after 20 min
- quic-go vs. neqo-latest: LR :rocket:~~3 C1~~ :warning:R B U L1 C2 CM
- quiche vs. neqo-latest: :rocket:~~M A C2 6~~ :warning:LR L1 C1 BA CM
- quinn vs. neqo-latest: :rocket:~~H LR C20 R~~ :warning:S A L1 :rocket:~~L2 C1~~ :warning:C2 6 V2 CM
- s2n-quic vs. neqo-latest: :rocket:~~DC E~~ :warning:R L1 6 CM
- tquic vs. neqo-latest: run cancelled after 20 min
- xquic vs. neqo-latest: :warning:H DC B 6 CM
All results
Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: H DC LR C20 M :rocket:~~S R~~ Z 3 :warning:B C2 6 V2 :rocket:~~A C1 BP~~ BA
- neqo-latest vs. go-x-net: :warning:H :rocket:~~DC~~ LR :rocket:~~M~~ B :warning:U A L2 :rocket:~~C2 6~~
- neqo-latest vs. haproxy: H DC LR :warning:M S R 3 :rocket:~~C20~~ B U A :rocket:~~L1~~ L2 C1 :warning:C2 6 V2
- neqo-latest vs. kwik: H :warning:DC LR :warning:C20 M S Z 3 B :warning:A 6 :rocket:~~L2 C2~~ V2
- neqo-latest vs. linuxquic: H DC LR C20 :rocket:~~M~~ S :rocket:~~Z~~ 3 B :rocket:~~U E~~ A L2 C2 :rocket:~~6 BP~~ BA CM
- neqo-latest vs. lsquic: H :warning:DC LR :rocket:~~C20~~ M S R :warning:B :rocket:~~Z~~ U E A L2 :warning:6 V2 :warning:BP BA CM
- neqo-latest vs. msquic: :warning:H DC LR C20 :warning:M :rocket:~~Z~~ B U :rocket:~~L2~~ C2 :warning:6 V2 :warning:BP
- neqo-latest vs. mvfst: H DC :warning:LR M Z B U :rocket:~~L1~~ L2 :warning:C1 :rocket:~~C2 6 BA~~
- neqo-latest vs. neqo: H DC :warning:C20 S R :warning:Z 3 B :warning:U E L1 L2 :warning:C1 C2 6 :rocket:~~BP~~ BA CM
- neqo-latest vs. neqo-latest: H :warning:DC LR :rocket:~~C20~~ M S :warning:R :rocket:~~Z 3~~ B :warning:E :rocket:~~U~~ L1 L2 C1 :rocket:~~C2 6~~ V2 :warning:BP BA :rocket:~~CM~~
- neqo-latest vs. nginx: :warning:H DC LR :warning:C20 M S :rocket:~~R~~ Z 3 B :rocket:~~U~~ A L1 L2 C1 C2 :warning:6
- neqo-latest vs. ngtcp2: H DC :warning:LR M S R Z 3 :warning:U E :warning:A L1 :rocket:~~C1~~ C2 :warning:6 V2 :rocket:~~BP~~ BA
- neqo-latest vs. picoquic: :warning:H DC :warning:C20 :rocket:~~LR~~ M S R :warning:Z 3 :rocket:~~B~~ U E L2 :warning:C1 C2 :rocket:~~V2~~ BP BA
- neqo-latest vs. quic-go: H DC LR C20 :rocket:~~M~~ S R :rocket:~~Z~~ 3 B U :warning:L1 :rocket:~~L2~~ C1 :warning:C2 6 :rocket:~~BP~~ BA
- neqo-latest vs. quiche: :rocket:~~H~~ DC LR :warning:S B U :warning:A :rocket:~~L1 L2~~ C1 :rocket:~~C2~~ 6
- neqo-latest vs. quinn: :rocket:~~H DC~~ LR C20 M S R :warning:Z 3 B :rocket:~~E~~ L1 L2 C1 C2 :rocket:~~6~~ BP :rocket:~~BA~~
- neqo-latest vs. s2n-quic: :warning:H DC :warning:LR C20 :rocket:~~M~~ S R :warning:3 B E :warning:A L1 L2 C2 :warning:6 BP
- neqo-latest vs. tquic: :rocket:~~H~~ LR :warning:C20 :rocket:~~M~~ R :rocket:~~Z~~ 3 :rocket:~~B~~ U :warning:A :rocket:~~L1~~ L2 C1 6
- neqo-latest vs. xquic: :rocket:~~H~~ DC :warning:LR C20 M R Z :warning:3 B U L2 :rocket:~~C2~~ 6 BP :rocket:~~BA~~
neqo-latest as server
- aioquic vs. neqo-latest: H DC LR C20 :warning:M S R Z B A L1 L2 C1 C2 :rocket:~~6 V2~~ BP :warning:BA
- go-x-net vs. neqo-latest: :warning:LR :rocket:~~H M~~ B :warning:U :rocket:~~A~~ L2 :warning:C2 6 BP BA
- kwik vs. neqo-latest: H DC :warning:LR C20 :rocket:~~M~~ S :warning:Z :rocket:~~R~~ U :warning:A L1 L2 C1 6 :warning:V2
- linuxquic vs. neqo-latest: :rocket:~~H DC LR C20 M S B U E A L1 L2 C1 C2 6 V2 CM~~
- lsquic vs. neqo-latest: :rocket:~~H~~ DC LR M :warning:3 :rocket:~~S R~~ B A L1 L2 :warning:C1 C2 6 BP :warning:BA :rocket:~~CM~~
- msquic vs. neqo-latest: :rocket:~~H~~ LR :rocket:~~C20~~ M S R Z :rocket:~~B~~ A L1 :warning:L2 C1 :warning:C2 BP :rocket:~~6 V2~~ BA
- mvfst vs. neqo-latest: H :warning:DC LR :warning:B 6 :rocket:~~3~~ BP :warning:BA
- neqo vs. neqo-latest: H :warning:DC LR C20 M S R Z U :rocket:~~E~~ L1 C2 6 V2 BA :warning:CM
- ngtcp2 vs. neqo-latest: DC :warning:LR :rocket:~~C20~~ M :rocket:~~S R Z 3 B~~ U :rocket:~~E~~ A L1 L2 C1 :warning:C2 6 V2 BP :rocket:~~BA~~ CM
- openssl vs. neqo-latest: H :rocket:~~DC~~ C20 :rocket:~~R 3~~ B L2 :warning:C2 6 :rocket:~~BP~~ BA
- quic-go vs. neqo-latest: H DC C20 M S :warning:R Z :warning:B U :rocket:~~3~~ A :warning:L1 L2 :warning:C2 :rocket:~~C1~~ 6 BP BA
- quiche vs. neqo-latest: H DC :warning:LR :rocket:~~M~~ S R Z 3 B :warning:L1 :rocket:~~A~~ L2 :warning:C1 :rocket:~~C2 6~~ BP :warning:BA
- quinn vs. neqo-latest: :rocket:~~H~~ DC :rocket:~~LR C20~~ M :warning:S :rocket:~~R~~ Z 3 B U E :warning:A C2 :rocket:~~L2 C1~~ BP BA
- s2n-quic vs. neqo-latest: H :rocket:~~DC~~ LR M S :warning:R 3 B :rocket:~~E~~ A :warning:L1 L2 C1 C2 :warning:6 BP BA
- xquic vs. neqo-latest: :rocket:~~LR C20 S R Z 3 U A L1 L2 C1 C2 BA~~
Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: E CM
- neqo-latest vs. go-x-net: C20 S R Z 3 E L1 C1 V2 CM
- neqo-latest vs. haproxy: E CM
- neqo-latest vs. kwik: E CM
- neqo-latest vs. msquic: 3 E CM
- neqo-latest vs. mvfst: C20 S E V2 CM
- neqo-latest vs. nginx: E V2 CM
- neqo-latest vs. picoquic: CM
- neqo-latest vs. quic-go: E V2 CM
- neqo-latest vs. quiche: E V2 CM
- neqo-latest vs. quinn: V2 CM
- neqo-latest vs. s2n-quic: Z V2
- neqo-latest vs. tquic: E V2 CM
- neqo-latest vs. xquic: S E V2 CM
neqo-latest as server
- aioquic vs. neqo-latest: U E
- chrome vs. neqo-latest: H DC LR C20 M S R Z B U E A L1 L2 C1 C2 6 V2 BP BA CM
- go-x-net vs. neqo-latest: C20 M S R Z 3 U E A L1 C1 V2
- kwik vs. neqo-latest: M R E BP A BA
- lsquic vs. neqo-latest: C20 R Z U E CM BA
- msquic vs. neqo-latest: 3 U E BP CM
- mvfst vs. neqo-latest: C20 M S R U E A V2 BA CM
- ngtcp2 vs. neqo-latest: E BA BP
- openssl vs. neqo-latest: Z U E L1 C1 V2
- quic-go vs. neqo-latest: E V2
- quiche vs. neqo-latest: C20 U E V2
- s2n-quic vs. neqo-latest: C20 Z U V2
- xquic vs. neqo-latest: M E V2 BP
I have started to do a version of this in the glue code. It's a bit challenging because the current mainline of neqo has picked up a bunch of dependencies beyond that of Firefox, and I need to figure out how to upgrade those...
Am wondering if we should cut a neqo release soon before there is more divergence.
> Am wondering if we should cut a neqo release soon before there is more divergence.
I was planning to cut a new release once https://github.com/mozilla/neqo/pull/2492 is merged. @larseggert I am happy to cut a new release beforehand if you like.
Making this a draft PR, since we decided to test this in the glue first.
Closing here in favor of #2593.