
feat: Stop the PMTUD search at the interface MTU

larseggert opened this issue 1 year ago • 12 comments

Should we also optimistically start the search at the interface MTU, and only start from 1280 when that fails?

larseggert avatar Sep 26 '24 12:09 larseggert

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.30%. Comparing base (b070663) to head (2b85cfb). Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2135   +/-   ##
=======================================
  Coverage   93.29%   93.30%           
=======================================
  Files         113      113           
  Lines       36709    36727   +18     
  Branches    36709    36727   +18     
=======================================
+ Hits        34248    34268   +20     
+ Misses       1678     1676    -2     
  Partials      783      783           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Sep 26 '24 12:09 codecov[bot]

Should we also optimistically start the search at the interface MTU

Are there other projects using this optimistic approach?

If I understand RFC 8899 correctly, the local interface MTU is the end value, not the start value.

The MAX_PLPMTU is the largest size of PLPMTU. This has to be less than or equal to the maximum size of the PL packet that can be sent on the outgoing interface (constrained by the local interface MTU).

https://www.rfc-editor.org/rfc/rfc8899.html#section-5.1.2

mxinden avatar Sep 26 '24 13:09 mxinden
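In other words, the RFC constraint above amounts to clamping the upper end of the search. A minimal sketch of that clamping, assuming hypothetical names (`search_ceiling`, `BASE_PLPMTU`, an optional `local_interface_mtu`) rather than neqo's actual API:

```rust
// Sketch only: stop the upward PMTUD search at the local interface MTU,
// per RFC 8899's MAX_PLPMTU constraint. Names are illustrative.
const BASE_PLPMTU: usize = 1280; // lower bound used in this discussion
const MAX_PLPMTU: usize = 65535; // absolute upper bound for a UDP datagram

fn search_ceiling(local_interface_mtu: Option<usize>) -> usize {
    // MAX_PLPMTU must not exceed what the outgoing interface can carry,
    // so the search stops at the interface MTU whenever it is known.
    local_interface_mtu
        .unwrap_or(MAX_PLPMTU)
        .clamp(BASE_PLPMTU, MAX_PLPMTU)
}
```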

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to b070663a4f8f5fd042b02a9c1d08144e925ca4df.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

github-actions[bot] avatar Sep 26 '24 13:09 github-actions[bot]

All true, but in practice, the local interface is most often the limiting hop.

larseggert avatar Sep 26 '24 13:09 larseggert

Benchmark results

Performance differences relative to ad8f3909d9ae96219e56a7c31e6822d6e64a6dc9.

decode 4096 bytes, mask ff: :green_heart: Performance has improved.
       time:   [11.198 µs 11.242 µs 11.290 µs]
       change: [-5.9617% -5.5346% -5.0685%] (p = 0.00 < 0.05)
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
3 (3.00%) low mild
11 (11.00%) high severe
decode 1048576 bytes, mask ff: :green_heart: Performance has improved.
       time:   [2.8827 ms 2.8902 ms 2.8995 ms]
       change: [-7.4445% -7.0117% -6.5851%] (p = 0.00 < 0.05)
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) high mild
6 (6.00%) high severe
decode 4096 bytes, mask 7f: :green_heart: Performance has improved.
       time:   [19.516 µs 19.548 µs 19.588 µs]
       change: [-1.9813% -1.4895% -1.1080%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
2 (2.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
10 (10.00%) high severe
decode 1048576 bytes, mask 7f: :green_heart: Performance has improved.
       time:   [5.0715 ms 5.0842 ms 5.0987 ms]
       change: [-3.4917% -2.3386% -1.5961%] (p = 0.00 < 0.05)
Found 12 outliers among 100 measurements (12.00%)
12 (12.00%) high severe
decode 4096 bytes, mask 3f: :green_heart: Performance has improved.
       time:   [5.5312 µs 5.5475 µs 5.5699 µs]
       change: [-21.078% -19.727% -18.519%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
7 (7.00%) low mild
2 (2.00%) high mild
8 (8.00%) high severe
decode 1048576 bytes, mask 3f: :green_heart: Performance has improved.
       time:   [1.4168 ms 1.4268 ms 1.4396 ms]
       change: [-19.583% -18.994% -18.313%] (p = 0.00 < 0.05)
Found 9 outliers among 100 measurements (9.00%)
9 (9.00%) high severe
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [98.506 ns 98.819 ns 99.144 ns]
       change: [-0.4338% +0.0330% +0.4766%] (p = 0.89 > 0.05)

Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) high mild 8 (8.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.82 ns 117.14 ns 117.49 ns]
       change: [-0.1888% +0.1344% +0.4678%] (p = 0.43 > 0.05)

Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 2 (2.00%) high mild 9 (9.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.63 ns 117.20 ns 117.86 ns]
       change: [-0.0128% +0.6115% +1.2425%] (p = 0.06 > 0.05)

Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) low severe 1 (1.00%) high mild 10 (10.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [97.408 ns 97.540 ns 97.683 ns]
       change: [+0.0501% +0.7347% +1.5543%] (p = 0.05 > 0.05)

Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) high mild 7 (7.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [111.13 ms 111.27 ms 111.49 ms]
       change: [-1.0423% -0.7981% -0.5448%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
9 (9.00%) low mild
7 (7.00%) high mild
1 (1.00%) high severe
SentPackets::take_ranges: No change in performance detected.
       time:   [5.5276 µs 5.6914 µs 5.8570 µs]
       change: [-15.302% -0.4349% +17.872%] (p = 0.94 > 0.05)

Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe

transfer/pacing-false/varying-seeds: :broken_heart: Performance has regressed.
       time:   [42.866 ms 42.945 ms 43.024 ms]
       change: [+42.829% +45.519% +48.464%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
transfer/pacing-true/varying-seeds: :broken_heart: Performance has regressed.
       time:   [43.121 ms 43.195 ms 43.268 ms]
       change: [+36.398% +38.723% +41.287%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-false/same-seed: :broken_heart: Performance has regressed.
       time:   [42.670 ms 42.734 ms 42.806 ms]
       change: [+68.397% +72.050% +75.985%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
transfer/pacing-true/same-seed: :broken_heart: Performance has regressed.
       time:   [42.933 ms 42.986 ms 43.039 ms]
       change: [+46.924% +51.065% +55.444%] (p = 0.00 < 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [869.17 ms 878.56 ms 888.07 ms]
       thrpt:  [112.60 MiB/s 113.82 MiB/s 115.05 MiB/s]
change:
       time:   [-0.2526% +1.3795% +2.9237%] (p = 0.09 > 0.05)
       thrpt:  [-2.8406% -1.3607% +0.2533%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: :green_heart: Performance has improved.
       time:   [297.26 ms 299.20 ms 301.15 ms]
       thrpt:  [33.207 Kelem/s 33.423 Kelem/s 33.640 Kelem/s]
change:
       time:   [-7.0953% -5.9407% -4.8692%] (p = 0.00 < 0.05)
       thrpt:  [… +6.3159% +7.6372%]

Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) low mild 2 (2.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: :broken_heart: Performance has regressed.
       time:   [34.226 ms 34.430 ms 34.655 ms]
       thrpt:  [28.856  elem/s 29.044  elem/s 29.218  elem/s]
change:
       time:   [+1.4019% +2.2722% +3.1384%] (p = 0.00 < 0.05)
       thrpt:  [… -2.2218% -1.3825%]

Found 17 outliers among 100 measurements (17.00%) 8 (8.00%) low mild 3 (3.00%) high mild 6 (6.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: No change in performance detected.
       time:   [1.6597 s 1.6757 s 1.6916 s]
       thrpt:  [59.115 MiB/s 59.676 MiB/s 60.250 MiB/s]
change:
       time:   [-0.6191% +0.6644% +2.0114%] (p = 0.34 > 0.05)
       thrpt:  [-1.9717% -0.6600% +0.6230%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
|---|---|---|---|---|---|---|---|
| gquiche | gquiche | | | 1504 | 522.0 ± 11.2 | 508.6 | 541.0 |
| neqo | gquiche | reno | on | 1504 | 770.3 ± 11.3 | 750.5 | 784.1 |
| neqo | gquiche | reno | | 1504 | 756.1 ± 14.0 | 738.4 | 786.8 |
| neqo | gquiche | cubic | on | 1504 | 752.8 ± 7.1 | 742.8 | 761.5 |
| neqo | gquiche | cubic | | 1504 | 750.6 ± 9.7 | 739.4 | 769.4 |
| msquic | msquic | | | 1504 | 140.6 ± 64.3 | 95.0 | 335.7 |
| neqo | msquic | reno | on | 1504 | 211.6 ± 10.8 | 198.2 | 230.4 |
| neqo | msquic | reno | | 1504 | 214.1 ± 10.9 | 197.5 | 229.1 |
| neqo | msquic | cubic | on | 1504 | 244.0 ± 50.8 | 210.4 | 364.2 |
| neqo | msquic | cubic | | 1504 | 215.9 ± 9.8 | 199.0 | 227.0 |
| gquiche | neqo | reno | on | 1504 | 690.6 ± 94.5 | 563.5 | 825.3 |
| gquiche | neqo | reno | | 1504 | 708.8 ± 75.8 | 609.8 | 822.9 |
| gquiche | neqo | cubic | on | 1504 | 671.9 ± 84.1 | 547.5 | 797.1 |
| gquiche | neqo | cubic | | 1504 | 692.6 ± 92.7 | 566.1 | 827.8 |
| msquic | neqo | reno | on | 1504 | 472.8 ± 8.3 | 462.4 | 485.5 |
| msquic | neqo | reno | | 1504 | 493.7 ± 8.8 | 484.0 | 505.6 |
| msquic | neqo | cubic | on | 1504 | 485.2 ± 11.6 | 465.4 | 510.8 |
| msquic | neqo | cubic | | 1504 | 490.6 ± 48.0 | 457.9 | 623.9 |
| neqo | neqo | reno | on | 1504 | 494.5 ± 22.3 | 459.7 | 530.2 |
| neqo | neqo | reno | | 1504 | 473.2 ± 28.0 | 443.3 | 525.4 |
| neqo | neqo | cubic | on | 1504 | 560.0 ± 37.4 | 506.9 | 648.3 |
| neqo | neqo | cubic | | 1504 | 545.5 ± 25.2 | 498.2 | 567.1 |

:arrow_down: Download logs

github-actions[bot] avatar Sep 26 '24 13:09 github-actions[bot]

This PR exposed a bug in mtu 🫤 https://github.com/mozilla/mtu/pull/26

larseggert avatar Sep 26 '24 13:09 larseggert

Should we also optimistically start the search at the interface MTU

Are there other projects using this optimistic approach?

If I understand RFC 8899 correctly the local interface MTU is the end value, not the start value.

The MAX_PLPMTU is the largest size of PLPMTU. This has to be less than or equal to the maximum size of the PL packet that can be sent on the outgoing interface (constrained by the local interface MTU).

https://www.rfc-editor.org/rfc/rfc8899.html#section-5.1.2

All true, but in practice, the local interface is most often the limiting hop.

Let me make sure I understand the implications here correctly. Sorry for any potential mistakes.

We only start probing once the connection is confirmed.

https://github.com/mozilla/neqo/blob/55e3a9363c28632dfb29ce91c7712cab1f6a58da/neqo-transport/src/connection/mod.rs#L2794-L2802

Say that a client's path MTU is smaller than their local interface MTU. Given that probing only starts once confirmed, i.e. after receiving HANDSHAKE_DONE from the server, initial HTTP requests would not be delayed, but only one subsequent flight of requests would be delayed by up to one PTO. Is that correct?

Thus this optimization, and really all of PMTUD probing, assumes that potentially delaying one subsequent flight of HTTP requests by up to one PTO is a price worth paying for the potential increase in overall connection throughput.

Is that correct?

mxinden avatar Sep 26 '24 19:09 mxinden
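A small sketch of the gating described above, with an assumed `State` enum standing in for neqo's richer connection state machine; because probing kicks in only after the connection is confirmed, a lost probe can delay a later flight by up to one PTO but never the handshake:

```rust
// Illustrative only; neqo's real connection state machine has more states.
#[derive(PartialEq)]
enum State {
    Handshaking,
    Confirmed, // HANDSHAKE_DONE received (client) or sent (server)
}

/// PMTUD probing starts only once the connection is confirmed.
fn should_start_pmtud(state: &State, already_started: bool) -> bool {
    *state == State::Confirmed && !already_started
}
```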

Should we also optimistically start the search at the interface MTU

Let me make sure I understand the implications here correctly. Sorry for any potential mistakes. We only start probing once the connection is confirmed.

This would need to change. What I think we should do is roughly this:

  • Start sending at the local interface MTU when the connection starts (i.e., for the Initial).
  • If we detect a loss n times, we revert to probing up from 1280.

n should probably be something like 2, so we don't cause undue delay.

larseggert avatar Sep 27 '24 05:09 larseggert
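A rough sketch of this proposal, where the constant names and the loss-counting heuristic are assumptions for illustration rather than anything neqo implements today:

```rust
const BASE_PLPMTU: usize = 1280; // fallback floor, per the discussion above
const MAX_MTU_LOSSES: usize = 2; // the "n" proposed above

struct Pmtud {
    plpmtu: usize,            // packetization-layer path MTU currently in use
    losses_at_plpmtu: usize,  // consecutive losses at that size
}

impl Pmtud {
    /// Optimistically assume the path supports the local interface MTU.
    fn new(interface_mtu: usize) -> Self {
        Self { plpmtu: interface_mtu, losses_at_plpmtu: 0 }
    }

    /// Called when a packet sized to the current PLPMTU is declared lost.
    fn on_loss(&mut self) {
        self.losses_at_plpmtu += 1;
        if self.losses_at_plpmtu >= MAX_MTU_LOSSES && self.plpmtu > BASE_PLPMTU {
            // The optimistic assumption failed; restart the conventional
            // upward search from the base PLPMTU.
            self.plpmtu = BASE_PLPMTU;
            self.losses_at_plpmtu = 0;
        }
    }
}
```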

In the case where a client's path MTU is smaller than their local interface MTU, this would add a delay of 2*PTO to every connection establishment, right? If so, isn't that a high cost for the potential increase in throughput? Or is such a scenario just very rare?

mxinden avatar Sep 27 '24 07:09 mxinden

Yes. I think this is a rare case, but maybe we add some telemetry first to confirm?

We could also cache a probed MTU towards a destination IP, like the OS does for a TCP MSS it has determined.

larseggert avatar Sep 27 '24 07:09 larseggert
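A hypothetical sketch of such a per-destination cache; neqo has no such cache today, and the structure, TTL, and method names are assumptions for illustration:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical per-destination cache of discovered path MTUs, similar to
/// how an OS caches a learned TCP MSS.
struct MtuCache {
    entries: HashMap<IpAddr, (usize, Instant)>,
    ttl: Duration,
}

impl MtuCache {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Remember the PMTU a finished search converged on for this peer.
    fn insert(&mut self, dst: IpAddr, pmtu: usize) {
        self.entries.insert(dst, (pmtu, Instant::now()));
    }

    /// A fresh connection can start its search from a cached value, falling
    /// back to the normal search when the entry is missing or stale.
    fn lookup(&self, dst: &IpAddr) -> Option<usize> {
        self.entries
            .get(dst)
            .filter(|(_, when)| when.elapsed() < self.ttl)
            .map(|(pmtu, _)| *pmtu)
    }
}
```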

Yes. I think this is a rare case, but maybe we add some telemetry first to confirm?

How about we:

  1. Roll out PMTUD on Firefox Nightly without the optimization (i.e., without starting the search at the local interface MTU).
  2. Enable the optimization, monitoring whether connection establishment times stay stable.

mxinden avatar Sep 27 '24 08:09 mxinden

I was thinking we just log the local interface MTU together with the discovered PMTU, and check for differences.

larseggert avatar Sep 27 '24 08:09 larseggert
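As a rough illustration of that telemetry, something along these lines could record both values at the end of the search; the function, field names, and the use of the `log` crate are assumptions:

```rust
/// Hypothetical telemetry hook: log the local interface MTU next to the
/// PMTU the search converged on, so we can measure how often they differ
/// before enabling the optimistic start.
fn record_pmtud_result(interface_mtu: usize, discovered_pmtu: usize) {
    log::info!(
        "pmtud: interface_mtu={interface_mtu} discovered_pmtu={discovered_pmtu} differ={}",
        interface_mtu != discovered_pmtu
    );
}
```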

We need to bump the rayon crate in mozilla-central to 1.7, which is what windows-bindgen requires, before we can land this PR and bring in mtu as a dependency.

(See https://github.com/larseggert/neqo/actions/runs/11972576345/job/33379768209#step:8:86 for Firefox build failure.)

@mxinden @valenting @KershawChang any tips on how to do this?

larseggert avatar Nov 25 '24 07:11 larseggert

We will need to update the following crates:

➜  cargo tree --invert=rayon
rayon v1.6.1
├── style v0.0.1
│   ├── geckoservo v0.0.1
│   │   └── gkrust-shared v0.1.0
│   │       ├── gkrust v0.1.0
│   │       └── gkrust-gtest v0.1.0
│   └── stylo_tests v0.0.1
│       [dev-dependencies]
│       └── gkrust v0.1.0
├── webrender v0.62.0
│   └── webrender_bindings v0.1.0
│       └── gkrust-shared v0.1.0
├── webrender_bindings v0.1.0
└── wr_glyph_rasterizer v0.1.0
    └── webrender v0.62.0

Note that, given rayon 1.7 is a minor release only, i.e. non-breaking, we might get away with updating only a subset of them.

For the record, windows-bindgen introduced the rayon v1.7 dependency in windows-bindgen v0.49.0. Had that been a more recent version, it might have been worth exploring support for both the most recent release and the one without rayon, similar to what we did in https://github.com/quinn-rs/quinn/actions/runs/11553182968/job/32153809202?pr=2021.

mxinden avatar Nov 25 '24 08:11 mxinden

➜  mozilla-central git:(bookmarks/central) cargo update -p rayon  --precise 1.7.0
    Updating crates.io index
    Updating rayon v1.6.1 -> v1.7.0

Neat. It seems to apply without issues.

@larseggert I can create a mozilla-central Phabricator patch if you want. In case there aren't any hidden issues, the only real work will be auditing rayon v1.6.1 -> v1.7.0 code changes.

mxinden avatar Nov 25 '24 08:11 mxinden

@larseggert I can create a mozilla-central Phabricator patch if you want. In case there aren't any hidden issues, the only real work will be auditing rayon v1.6.1 -> v1.7.0 code changes.

That would be great, thanks. (I'm surprised mozilla-central doesn't have some sort of auto-upgrade of dependencies...)

larseggert avatar Nov 25 '24 08:11 larseggert

For the record, updating mozilla-central to a recent rayon version is happening through https://bugzilla.mozilla.org/show_bug.cgi?id=1933199.

mxinden avatar Nov 25 '24 16:11 mxinden

For the record, updating mozilla-central to a recent rayon version is happening through https://bugzilla.mozilla.org/show_bug.cgi?id=1933199.

rayon v1.10.0 landed in mozilla-central last night.

mxinden avatar Nov 29 '24 07:11 mxinden

Thanks, saw it. Trying to make mtu importable.

larseggert avatar Nov 29 '24 07:11 larseggert

Next blocker:

Missing audit for windows-bindgen:0.58.0 (requires ['safe-to-deploy']).

larseggert avatar Nov 29 '24 08:11 larseggert

@mxinden should we update more deps on m-c before merging this, or do we want to handle that when we vendor neqo into m-c the next time?

larseggert avatar Dec 17 '24 11:12 larseggert

Other than the rayon (:heavy_check_mark:) update and the windows-bindgen audit, what else is missing @larseggert?

mxinden avatar Dec 17 '24 13:12 mxinden

Other than the rayon (:heavy_check_mark:) update and the windows-bindgen audit, what else is missing @larseggert?

I think windows-metadata, but that is in the same category as windows-bindgen.

larseggert avatar Dec 17 '24 14:12 larseggert

If it is audits only, I suggest we "handle that when we vendor neqo into m-c the next time".

Though note, auditing windows-bindgen from scratch might be a significant undertaking.

mxinden avatar Dec 17 '24 14:12 mxinden

Though note, auditing windows-bindgen from scratch might be a significant undertaking.

I don't think we need to - the author is trusted.

larseggert avatar Dec 18 '24 14:12 larseggert