
feat: Stop the PMTUD search at the interface MTU

larseggert opened this issue 1 year ago • 12 comments

Should we also optimistically start the search at the interface MTU, and only start from 1280 when that fails?

larseggert avatar Sep 26 '24 12:09 larseggert

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 93.30%. Comparing base (b070663) to head (2b85cfb). Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2135   +/-   ##
=======================================
  Coverage   93.29%   93.30%           
=======================================
  Files         113      113           
  Lines       36709    36727   +18     
  Branches    36709    36727   +18     
=======================================
+ Hits        34248    34268   +20     
+ Misses       1678     1676    -2     
  Partials      783      783           

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Sep 26 '24 12:09 codecov[bot]

Should we also optimistically start the search at the interface MTU

Are there other projects using this optimistic approach?

If I understand RFC 8899 correctly, the local interface MTU is the end value, not the start value.

The MAX_PLPMTU is the largest size of PLPMTU. This has to be less than or equal to the maximum size of the PL packet that can be sent on the outgoing interface (constrained by the local interface MTU).

https://www.rfc-editor.org/rfc/rfc8899.html#section-5.1.2

mxinden avatar Sep 26 '24 13:09 mxinden
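In other words, the RFC constraint above amounts to clamping the upper end of the search. A minimal sketch of that clamping, assuming hypothetical names (`search_ceiling`, `BASE_PLPMTU`, an optional `local_interface_mtu`) rather than neqo's actual API:

```rust
// Sketch only: stop the upward PMTUD search at the local interface MTU,
// per RFC 8899's MAX_PLPMTU constraint. Names are illustrative.
const BASE_PLPMTU: usize = 1280; // lower bound used in this discussion
const MAX_PLPMTU: usize = 65535; // absolute upper bound for a UDP datagram

fn search_ceiling(local_interface_mtu: Option<usize>) -> usize {
    // MAX_PLPMTU must not exceed what the outgoing interface can carry,
    // so the search stops at the interface MTU whenever it is known.
    local_interface_mtu
        .unwrap_or(MAX_PLPMTU)
        .clamp(BASE_PLPMTU, MAX_PLPMTU)
}
```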

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to b070663a4f8f5fd042b02a9c1d08144e925ca4df.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

github-actions[bot] avatar Sep 26 '24 13:09 github-actions[bot]

All true, but in practice, the local interface is most often the limiting hop.

larseggert avatar Sep 26 '24 13:09 larseggert

Benchmark results

Performance differences relative to ad8f3909d9ae96219e56a7c31e6822d6e64a6dc9.

decode 4096 bytes, mask ff: :green_heart: Performance has improved.
       time:   [11.198 µs 11.242 µs 11.290 µs]
       change: [-5.9617% -5.5346% -5.0685%] (p = 0.00 < 0.05)
Found 16 outliers among 100 measurements (16.00%)
2 (2.00%) low severe
3 (3.00%) low mild
11 (11.00%) high severe
decode 1048576 bytes, mask ff: :green_heart: Performance has improved.
       time:   [2.8827 ms 2.8902 ms 2.8995 ms]
       change: [-7.4445% -7.0117% -6.5851%] (p = 0.00 < 0.05)
Found 7 outliers among 100 measurements (7.00%)
1 (1.00%) high mild
6 (6.00%) high severe
decode 4096 bytes, mask 7f: :green_heart: Performance has improved.
       time:   [19.516 µs 19.548 µs 19.588 µs]
       change: [-1.9813% -1.4895% -1.1080%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
2 (2.00%) low severe
2 (2.00%) low mild
3 (3.00%) high mild
10 (10.00%) high severe
decode 1048576 bytes, mask 7f: :green_heart: Performance has improved.
       time:   [5.0715 ms 5.0842 ms 5.0987 ms]
       change: [-3.4917% -2.3386% -1.5961%] (p = 0.00 < 0.05)
Found 12 outliers among 100 measurements (12.00%)
12 (12.00%) high severe
decode 4096 bytes, mask 3f: :green_heart: Performance has improved.
       time:   [5.5312 µs 5.5475 µs 5.5699 µs]
       change: [-21.078% -19.727% -18.519%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
7 (7.00%) low mild
2 (2.00%) high mild
8 (8.00%) high severe
decode 1048576 bytes, mask 3f: :green_heart: Performance has improved.
       time:   [1.4168 ms 1.4268 ms 1.4396 ms]
       change: [-19.583% -18.994% -18.313%] (p = 0.00 < 0.05)
Found 9 outliers among 100 measurements (9.00%)
9 (9.00%) high severe
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [98.506 ns 98.819 ns 99.144 ns]
       change: [-0.4338% +0.0330% +0.4766%] (p = 0.89 > 0.05)

Found 9 outliers among 100 measurements (9.00%) 1 (1.00%) high mild 8 (8.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [116.82 ns 117.14 ns 117.49 ns]
       change: [-0.1888% +0.1344% +0.4678%] (p = 0.43 > 0.05)

Found 12 outliers among 100 measurements (12.00%) 1 (1.00%) low mild 2 (2.00%) high mild 9 (9.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [116.63 ns 117.20 ns 117.86 ns]
       change: [-0.0128% +0.6115% +1.2425%] (p = 0.06 > 0.05)

Found 15 outliers among 100 measurements (15.00%) 4 (4.00%) low severe 1 (1.00%) high mild 10 (10.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [97.408 ns 97.540 ns 97.683 ns]
       change: [+0.0501% +0.7347% +1.5543%] (p = 0.05 > 0.05)

Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) high mild 7 (7.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [111.13 ms 111.27 ms 111.49 ms]
       change: [-1.0423% -0.7981% -0.5448%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
9 (9.00%) low mild
7 (7.00%) high mild
1 (1.00%) high severe
SentPackets::take_ranges: No change in performance detected.
       time:   [5.5276 µs 5.6914 µs 5.8570 µs]
       change: [-15.302% -0.4349% +17.872%] (p = 0.94 > 0.05)

Found 4 outliers among 100 measurements (4.00%) 3 (3.00%) high mild 1 (1.00%) high severe

transfer/pacing-false/varying-seeds: :broken_heart: Performance has regressed.
       time:   [42.866 ms 42.945 ms 43.024 ms]
       change: [+42.829% +45.519% +48.464%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
transfer/pacing-true/varying-seeds: :broken_heart: Performance has regressed.
       time:   [43.121 ms 43.195 ms 43.268 ms]
       change: [+36.398% +38.723% +41.287%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-false/same-seed: :broken_heart: Performance has regressed.
       time:   [42.670 ms 42.734 ms 42.806 ms]
       change: [+68.397% +72.050% +75.985%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
transfer/pacing-true/same-seed: :broken_heart: Performance has regressed.
       time:   [42.933 ms 42.986 ms 43.039 ms]
       change: [+46.924% +51.065% +55.444%] (p = 0.00 < 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [869.17 ms 878.56 ms 888.07 ms]
       thrpt:  [112.60 MiB/s 113.82 MiB/s 115.05 MiB/s]
change:
       time:   [-0.2526% +1.3795% +2.9237%] (p = 0.09 > 0.05)
       thrpt:  [-2.8406% -1.3607% +0.2533%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: :green_heart: Performance has improved.
       time:   [297.26 ms 299.20 ms 301.15 ms]
       thrpt:  [33.207 Kelem/s 33.423 Kelem/s 33.640 Kelem/s]
change:
       time:   [-7.0953% -5.9407% -4.8692%] (p = 0.00 < 0.05)
       thrpt:  [… +6.3159% +7.6372%]

Found 4 outliers among 100 measurements (4.00%) 2 (2.00%) low mild 2 (2.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: :broken_heart: Performance has regressed.
       time:   [34.226 ms 34.430 ms 34.655 ms]
       thrpt:  [28.856  elem/s 29.044  elem/s 29.218  elem/s]
change:
       time:   [+1.4019% +2.2722% +3.1384%] (p = 0.00 < 0.05)
       thrpt:  [… -2.2218% -1.3825%]

Found 17 outliers among 100 measurements (17.00%) 8 (8.00%) low mild 3 (3.00%) high mild 6 (6.00%) high severe

1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: No change in performance detected.
       time:   [1.6597 s 1.6757 s 1.6916 s]
       thrpt:  [59.115 MiB/s 59.676 MiB/s 60.250 MiB/s]
change:
       time:   [-0.6191% +0.6644% +2.0114%] (p = 0.34 > 0.05)
       thrpt:  [-1.9717% -0.6600% +0.6230%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | MTU | Mean [ms] | Min [ms] | Max [ms] |
|---|---|---|---|---|---|---|---|
| gquiche | gquiche | | | 1504 | 522.0 ± 11.2 | 508.6 | 541.0 |
| neqo | gquiche | reno | on | 1504 | 770.3 ± 11.3 | 750.5 | 784.1 |
| neqo | gquiche | reno | | 1504 | 756.1 ± 14.0 | 738.4 | 786.8 |
| neqo | gquiche | cubic | on | 1504 | 752.8 ± 7.1 | 742.8 | 761.5 |
| neqo | gquiche | cubic | | 1504 | 750.6 ± 9.7 | 739.4 | 769.4 |
| msquic | msquic | | | 1504 | 140.6 ± 64.3 | 95.0 | 335.7 |
| neqo | msquic | reno | on | 1504 | 211.6 ± 10.8 | 198.2 | 230.4 |
| neqo | msquic | reno | | 1504 | 214.1 ± 10.9 | 197.5 | 229.1 |
| neqo | msquic | cubic | on | 1504 | 244.0 ± 50.8 | 210.4 | 364.2 |
| neqo | msquic | cubic | | 1504 | 215.9 ± 9.8 | 199.0 | 227.0 |
| gquiche | neqo | reno | on | 1504 | 690.6 ± 94.5 | 563.5 | 825.3 |
| gquiche | neqo | reno | | 1504 | 708.8 ± 75.8 | 609.8 | 822.9 |
| gquiche | neqo | cubic | on | 1504 | 671.9 ± 84.1 | 547.5 | 797.1 |
| gquiche | neqo | cubic | | 1504 | 692.6 ± 92.7 | 566.1 | 827.8 |
| msquic | neqo | reno | on | 1504 | 472.8 ± 8.3 | 462.4 | 485.5 |
| msquic | neqo | reno | | 1504 | 493.7 ± 8.8 | 484.0 | 505.6 |
| msquic | neqo | cubic | on | 1504 | 485.2 ± 11.6 | 465.4 | 510.8 |
| msquic | neqo | cubic | | 1504 | 490.6 ± 48.0 | 457.9 | 623.9 |
| neqo | neqo | reno | on | 1504 | 494.5 ± 22.3 | 459.7 | 530.2 |
| neqo | neqo | reno | | 1504 | 473.2 ± 28.0 | 443.3 | 525.4 |
| neqo | neqo | cubic | on | 1504 | 560.0 ± 37.4 | 506.9 | 648.3 |
| neqo | neqo | cubic | | 1504 | 545.5 ± 25.2 | 498.2 | 567.1 |

:arrow_down: Download logs

github-actions[bot] avatar Sep 26 '24 13:09 github-actions[bot]

This PR exposed a bug in mtu 🫤 https://github.com/mozilla/mtu/pull/26

larseggert avatar Sep 26 '24 13:09 larseggert

Should we also optimistically start the search at the interface MTU

Are there other projects using this optimistic approach?

If I understand RFC 8899 correctly the local interface MTU is the end value, not the start value.

The MAX_PLPMTU is the largest size of PLPMTU. This has to be less than or equal to the maximum size of the PL packet that can be sent on the outgoing interface (constrained by the local interface MTU).

https://www.rfc-editor.org/rfc/rfc8899.html#section-5.1.2

All true, but in practice, the local interface is most often the limiting hop.

Let me make sure I understand the implications here correctly. Sorry for any potential mistakes.

We only start probing once the connection is confirmed.

https://github.com/mozilla/neqo/blob/55e3a9363c28632dfb29ce91c7712cab1f6a58da/neqo-transport/src/connection/mod.rs#L2794-L2802

Say that a client's path MTU is smaller than their local interface MTU. Given that probing only starts once confirmed, i.e. after receiving HANDSHAKE_DONE from the server, initial HTTP requests would not be delayed, but only one subsequent flight of requests would be delayed by up to one PTO. Is that correct?

Thus this optimization, and really all of PMTUD probing, assumes that potentially delaying one subsequent flight of HTTP requests by up to one PTO is a price worth paying for the potential increase in overall connection throughput.

Is that correct?

mxinden avatar Sep 26 '24 19:09 mxinden
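A small sketch of the gating described above, with an assumed `State` enum standing in for neqo's richer connection state machine; because probing kicks in only after the connection is confirmed, a lost probe can delay a later flight by up to one PTO but never the handshake:

```rust
// Illustrative only; neqo's real connection state machine has more states.
#[derive(PartialEq)]
enum State {
    Handshaking,
    Confirmed, // HANDSHAKE_DONE received (client) or sent (server)
}

/// PMTUD probing starts only once the connection is confirmed.
fn should_start_pmtud(state: &State, already_started: bool) -> bool {
    *state == State::Confirmed && !already_started
}
```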

Should we also optimistically start the search at the interface MTU

Let me make sure I understand the implications here correctly. Sorry for any potential mistakes. We only start probing once the connection is confirmed.

This would need to change. What I think we should do is roughly this:

  • Start sending at the local interface MTU when the connection starts (i.e., for the Initial).
  • If we detect a loss n times, we revert to probing up from 1280.

n should probably be something like 2, so we don't cause undue delay.

larseggert avatar Sep 27 '24 05:09 larseggert
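A rough sketch of this proposal, where the constant names and the loss-counting heuristic are assumptions for illustration rather than anything neqo implements today:

```rust
const BASE_PLPMTU: usize = 1280; // fallback floor, per the discussion above
const MAX_MTU_LOSSES: usize = 2; // the "n" proposed above

struct Pmtud {
    plpmtu: usize,            // packetization-layer path MTU currently in use
    losses_at_plpmtu: usize,  // consecutive losses at that size
}

impl Pmtud {
    /// Optimistically assume the path supports the local interface MTU.
    fn new(interface_mtu: usize) -> Self {
        Self { plpmtu: interface_mtu, losses_at_plpmtu: 0 }
    }

    /// Called when a packet sized to the current PLPMTU is declared lost.
    fn on_loss(&mut self) {
        self.losses_at_plpmtu += 1;
        if self.losses_at_plpmtu >= MAX_MTU_LOSSES && self.plpmtu > BASE_PLPMTU {
            // The optimistic assumption failed; restart the conventional
            // upward search from the base PLPMTU.
            self.plpmtu = BASE_PLPMTU;
            self.losses_at_plpmtu = 0;
        }
    }
}
```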

In the case where a client's path MTU is smaller than their local interface MTU, this would add a delay of 2*PTO to every connection establishment, right? If so, isn't that a high cost for the potential increase in throughput? Or is such a scenario just very rare?

mxinden avatar Sep 27 '24 07:09 mxinden

Yes. I think this is a rare case, but maybe we add some telemetry first to confirm?

We could also cache a probed MTU towards a destination IP, like the OS does for a TCP MSS it has determined.

larseggert avatar Sep 27 '24 07:09 larseggert
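A hypothetical sketch of such a per-destination cache; neqo has no such cache today, and the structure, TTL, and method names are assumptions for illustration:

```rust
use std::collections::HashMap;
use std::net::IpAddr;
use std::time::{Duration, Instant};

/// Hypothetical per-destination cache of discovered path MTUs, similar to
/// how an OS caches a learned TCP MSS.
struct MtuCache {
    entries: HashMap<IpAddr, (usize, Instant)>,
    ttl: Duration,
}

impl MtuCache {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Remember the PMTU a finished search converged on for this peer.
    fn insert(&mut self, dst: IpAddr, pmtu: usize) {
        self.entries.insert(dst, (pmtu, Instant::now()));
    }

    /// A fresh connection can start its search from a cached value, falling
    /// back to the normal search when the entry is missing or stale.
    fn lookup(&self, dst: &IpAddr) -> Option<usize> {
        self.entries
            .get(dst)
            .filter(|(_, when)| when.elapsed() < self.ttl)
            .map(|(pmtu, _)| *pmtu)
    }
}
```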

Yes. I think this is a rare case, but maybe we add some telemetry first to confirm?

How about we:

  1. Roll out PMTUD on Firefox Nightly without the optimization (i.e., without starting the search at the local interface MTU).
  2. Enable the optimization, monitoring whether connection establishment times stay stable.

mxinden avatar Sep 27 '24 08:09 mxinden

I was thinking we just log the local interface MTU together with the discovered PMTU, and check for differences.

larseggert avatar Sep 27 '24 08:09 larseggert
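As a rough illustration of that telemetry, something along these lines could record both values at the end of the search; the function, field names, and the use of the `log` crate are assumptions:

```rust
/// Hypothetical telemetry hook: log the local interface MTU next to the
/// PMTU the search converged on, so we can measure how often they differ
/// before enabling the optimistic start.
fn record_pmtud_result(interface_mtu: usize, discovered_pmtu: usize) {
    log::info!(
        "pmtud: interface_mtu={interface_mtu} discovered_pmtu={discovered_pmtu} differ={}",
        interface_mtu != discovered_pmtu
    );
}
```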

We need to bump the rayon crate in mozilla-central to 1.7, which is what windows-bindgen requires, before we can land this PR and bring in mtu as a dependency.

(See https://github.com/larseggert/neqo/actions/runs/11972576345/job/33379768209#step:8:86 for Firefox build failure.)

@mxinden @valenting @KershawChang any tips on how to do this?

larseggert avatar Nov 25 '24 07:11 larseggert

We will need to update the following crates:

➜  cargo tree --invert=rayon
rayon v1.6.1
├── style v0.0.1
│   ├── geckoservo v0.0.1
│   │   └── gkrust-shared v0.1.0
│   │       ├── gkrust v0.1.0
│   │       └── gkrust-gtest v0.1.0
│   └── stylo_tests v0.0.1
│       [dev-dependencies]
│       └── gkrust v0.1.0
├── webrender v0.62.0
│   └── webrender_bindings v0.1.0
│       └── gkrust-shared v0.1.0
├── webrender_bindings v0.1.0
└── wr_glyph_rasterizer v0.1.0
    └── webrender v0.62.0

Note that, given rayon 1.7 is a minor release only, i.e. non-breaking, we might get away with updating only a subset of them.

For the record, windows-bindgen introduced the rayon v1.7 dependency in windows-bindgen v0.49.0. Had that been a more recent version, it might have been worth exploring support for both the most recent release and the one without rayon, similar to what we did in https://github.com/quinn-rs/quinn/actions/runs/11553182968/job/32153809202?pr=2021.

mxinden avatar Nov 25 '24 08:11 mxinden

➜  mozilla-central git:(bookmarks/central) cargo update -p rayon  --precise 1.7.0
    Updating crates.io index
    Updating rayon v1.6.1 -> v1.7.0

Neat. It seems to apply without issues.

@larseggert I can create a mozilla-central Phabricator patch if you want. In case there aren't any hidden issues, the only real work will be auditing rayon v1.6.1 -> v1.7.0 code changes.

mxinden avatar Nov 25 '24 08:11 mxinden

@larseggert I can create a mozilla-central Phabricator patch if you want. In case there aren't any hidden issues, the only real work will be auditing rayon v1.6.1 -> v1.7.0 code changes.

That would be great, thanks. (I'm surprised mozilla-central doesn't have some sort of auto-upgrade of dependencies...)

larseggert avatar Nov 25 '24 08:11 larseggert

For the record, updating mozilla-central to a recent rayon version is happening through https://bugzilla.mozilla.org/show_bug.cgi?id=1933199.

mxinden avatar Nov 25 '24 16:11 mxinden

For the record, updating mozilla-central to a recent rayon version is happening through https://bugzilla.mozilla.org/show_bug.cgi?id=1933199.

rayon v1.10.0 landed in mozilla-central last night.

mxinden avatar Nov 29 '24 07:11 mxinden

Thanks, saw it. Trying to make mtu importable.

larseggert avatar Nov 29 '24 07:11 larseggert

Next blocker:

Missing audit for windows-bindgen:0.58.0 (requires ['safe-to-deploy']).

larseggert avatar Nov 29 '24 08:11 larseggert

@mxinden should we update more deps on m-c before merging this, or do we want to handle that when we vendor neqo into m-c the next time?

larseggert avatar Dec 17 '24 11:12 larseggert

Other than the rayon (:heavy_check_mark:) update and the windows-bindgen audit, what else is missing @larseggert?

mxinden avatar Dec 17 '24 13:12 mxinden

Other than the rayon (:heavy_check_mark:) update and the windows-bindgen audit, what else is missing @larseggert?

I think windows-metadata, but that is in the same category as windows-bindgen.

larseggert avatar Dec 17 '24 14:12 larseggert

If it is audits only, I suggest we "handle that when we vendor neqo into m-c the next time".

Though note, auditing windows-bindgen from scratch might be a significant undertaking.

mxinden avatar Dec 17 '24 14:12 mxinden

Though note, auditing windows-bindgen from scratch might be a significant undertaking.

I don't think we need to - the author is trusted.

larseggert avatar Dec 18 '24 14:12 larseggert