fix: Put `qlog` behind a feature flag
Also remove qlog::add_event_data_now, since we now need to pass a timestamp around to more places anyway, and rename QlogMetric to qlog::Metric.
~This is pretty intrusive. Anyone got an idea how to make this cleaner?~ Edit: It's a bit better now.
Fixes #1894
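To illustrate the general shape of the change (a minimal sketch with made-up names, not the actual neqo or qlog API): the wrapper only has a real implementation when the `qlog` Cargo feature is enabled, and callers pass the timestamp in explicitly instead of the logger calling `Instant::now()` itself.

```rust
use std::time::Instant;

/// Hypothetical event type; in neqo the real `qlog` crate would provide this
/// when the feature is enabled.
pub struct Event(pub &'static str);

#[cfg(feature = "qlog")]
mod imp {
    use std::time::Instant;

    use super::Event;

    /// Real implementation: records events with caller-supplied timestamps.
    #[derive(Default)]
    pub struct Qlog {
        events: Vec<(Instant, &'static str)>,
    }

    impl Qlog {
        pub fn add_event(&mut self, now: Instant, make: impl FnOnce() -> Event) {
            let Event(name) = make();
            self.events.push((now, name));
        }
    }
}

#[cfg(not(feature = "qlog"))]
mod imp {
    use std::time::Instant;

    use super::Event;

    /// Stub with the same API surface: zero-sized, does nothing.
    #[derive(Default)]
    pub struct Qlog;

    impl Qlog {
        #[inline]
        pub fn add_event(&mut self, _now: Instant, _make: impl FnOnce() -> Event) {}
    }
}

pub use imp::Qlog;

fn main() {
    let mut qlog = Qlog::default();
    let now = Instant::now();
    // Callers pass `now` in; the event itself is built lazily in a closure.
    qlog.add_event(now, || Event("packet_sent"));
}
```

With the feature disabled, `Qlog` is a zero-sized stub and the call sites compile to nothing.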
Thanks for exploring the change.
This adds a lot of noise, so I suggest only proceeding here if it yields a significant performance win.
What will we do on the Firefox side? Currently one can dynamically enable qlog logging at runtime. Would we always enable the feature in Firefox? If so, how we run our Neqo tests would diverge from how we run Neqo in Firefox.
I would disable it in release builds. Do we ever ask for qlogs for bugzilla tickets?
> Do we ever ask for qlogs for bugzilla tickets?
Sometimes, yes. Though maybe we can deliver a custom build in those cases.
Because most of the work is deferred and uses closures, the compiler will be able to recognize that the closures aren't run, so it can erase that code. I don't know how far up the call tree that will go in terms of killing code, but it should be far enough.
That's what I was wondering: if I only stub it out in neqo-common, only that crate gains a qlog feature, so all the other crates are unaware of how it is set and still generate code for qlogging, even when the feature is off?
Let me see what it would look like if we also stubbed out qlog in the other crates.
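To make the dead-code-elimination argument above concrete, here is a self-contained sketch (illustrative names only, not neqo's actual types): because the no-op sink never invokes the closure, the closure body, including any formatting work, becomes dead code after inlining, independent of which crate the call site lives in.

```rust
/// Sketch of a disabled sink: it accepts a closure but never calls it.
struct DisabledQlog;

impl DisabledQlog {
    #[inline]
    fn add_event<F, T>(&mut self, _make: F)
    where
        F: FnOnce() -> T,
    {
        // `_make` is never invoked, so after inlining its body is dead code.
    }
}

fn on_packet_sent(qlog: &mut DisabledQlog, len: usize) {
    // The expensive work (formatting here) only happens inside the closure,
    // so with the no-op sink it never runs and the optimizer can drop it.
    // How far the elimination propagates up the call tree depends on inlining.
    qlog.add_event(|| format!("packet_sent len={len}"));
}

fn main() {
    on_packet_sent(&mut DisabledQlog, 1200);
}
```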
Codecov Report
:x: Patch coverage is 92.00000% with 2 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 93.10%. Comparing base (b9c32c7) to head (88508a1).
:warning: Report is 91 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #3005 +/- ##
==========================================
- Coverage 93.41% 93.10% -0.32%
==========================================
Files 124 124
Lines 36234 36254 +20
Branches 36234 36254 +20
==========================================
- Hits 33847 33753 -94
- Misses 1540 1655 +115
+ Partials 847 846 -1
| Components | Coverage Δ | |
|---|---|---|
| neqo-common | 95.64% <ø> (-1.68%) | :arrow_down: |
| neqo-crypto | 83.25% <ø> (-0.48%) | :arrow_down: |
| neqo-http3 | 92.88% <91.66%> (-0.42%) | :arrow_down: |
| neqo-qpack | 94.18% <ø> (ø) | |
| neqo-transport | 94.33% <100.00%> (-0.15%) | :arrow_down: |
| neqo-udp | 78.94% <ø> (-0.48%) | :arrow_down: |
| mtu | 85.76% <ø> (ø) | |
Bencher Report
| Branch | feat-1894 |
| Testbed | On-prem |
🚨 1 Alert
| Iteration | Benchmark | Measure Units | View | Benchmark Result (Result Δ%) | Upper Boundary (Limit %) |
|---|---|---|---|---|---|
| 2 | neqo vs. google (cubic, paced) | Latency milliseconds (ms) | 📈 plot 🚷 threshold 🚨 alert (🔔) | 779.04 ms (+2.54%), Baseline: 759.76 ms | 775.08 ms (100.51%) |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result milliseconds (ms) (Result Δ%) | Upper Boundary milliseconds (ms) (Limit %) |
|---|---|---|---|
| google vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 279.27 ms (+0.33%), Baseline: 278.35 ms | 282.74 ms (98.78%) |
| msquic vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 228.25 ms (+14.54%), Baseline: 199.27 ms | 236.95 ms (96.33%) |
| neqo vs. google (cubic, paced) | 📈 view plot 🚷 view threshold 🚨 view alert (🔔) | 779.04 ms (+2.54%), Baseline: 759.76 ms | 775.08 ms (100.51%) |
| neqo vs. msquic (cubic, paced) | 📈 view plot 🚷 view threshold | 157.95 ms (+0.11%), Baseline: 157.78 ms | 160.60 ms (98.35%) |
| neqo vs. neqo (cubic) | 📈 view plot 🚷 view threshold | 93.01 ms (+1.60%), Baseline: 91.55 ms | 96.86 ms (96.03%) |
| neqo vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 92.23 ms (-0.71%), Baseline: 92.90 ms | 98.08 ms (94.04%) |
| neqo vs. neqo (reno) | 📈 view plot 🚷 view threshold | 91.84 ms (+0.34%), Baseline: 91.53 ms | 96.68 ms (95.00%) |
| neqo vs. neqo (reno, paced) | 📈 view plot 🚷 view threshold | 94.95 ms (+2.33%), Baseline: 92.79 ms | 97.78 ms (97.11%) |
| neqo vs. quiche (cubic, paced) | 📈 view plot 🚷 view threshold | 194.30 ms (+0.34%), Baseline: 193.64 ms | 196.96 ms (98.65%) |
| neqo vs. s2n (cubic, paced) | 📈 view plot 🚷 view threshold | 220.83 ms (-0.14%), Baseline: 221.14 ms | 224.10 ms (98.54%) |
| quiche vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 156.69 ms (+2.33%), Baseline: 153.13 ms | 158.47 ms (98.88%) |
| s2n vs. neqo (cubic, paced) | 📈 view plot 🚷 view threshold | 169.97 ms (-2.19%), Baseline: 173.77 ms | 178.02 ms (95.47%) |
Bencher Report
| Branch | feat-1894 |
| Testbed | On-prem |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result nanoseconds (ns) (Result Δ%) | Upper Boundary nanoseconds (ns) (Limit %) |
|---|---|---|---|
| 1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client | 📈 view plot 🚷 view threshold | 212,540,000.00 ns(+2.78%)Baseline: 206,793,948.13 ns | 216,504,270.13 ns (98.17%) |
| 1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client | 📈 view plot 🚷 view threshold | 205,260,000.00 ns(+2.16%)Baseline: 200,914,005.76 ns | 211,157,536.76 ns (97.21%) |
| 1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client | 📈 view plot 🚷 view threshold | 38,943,000.00 ns(+22.84%)Baseline: 31,702,706.05 ns | 43,003,109.02 ns (90.56%) |
| 1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client | 📈 view plot 🚷 view threshold | 291,060,000.00 ns(-0.10%)Baseline: 291,350,518.73 ns | 303,600,377.09 ns (95.87%) |
| 1-streams/each-1000-bytes/simulated-time | 📈 view plot 🚷 view threshold | 119,170,000.00 ns(+0.42%)Baseline: 118,677,463.98 ns | 120,711,446.87 ns (98.72%) |
| 1-streams/each-1000-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 582,310.00 ns(-1.74%)Baseline: 592,610.29 ns | 615,409.62 ns (94.62%) |
| 1000-streams/each-1-bytes/simulated-time | 📈 view plot 🚷 view threshold | 2,331,000,000.00 ns(-82.64%)Baseline: 13,426,680,979.83 ns | 23,178,179,829.53 ns (10.06%) |
| 1000-streams/each-1-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 12,703,000.00 ns(-8.19%)Baseline: 13,836,132.56 ns | 15,087,187.42 ns (84.20%) |
| 1000-streams/each-1000-bytes/simulated-time | 📈 view plot 🚷 view threshold | 16,161,000,000.00 ns(-13.33%)Baseline: 18,645,951,008.65 ns | 20,643,862,366.60 ns (78.28%) |
| 1000-streams/each-1000-bytes/wallclock-time | 📈 view plot 🚷 view threshold | 51,328,000.00 ns(+1.29%)Baseline: 50,675,239.19 ns | 57,058,656.37 ns (89.96%) |
| RxStreamOrderer::inbound_frame() | 📈 view plot 🚷 view threshold | 109,790,000.00 ns(+0.08%)Baseline: 109,697,665.71 ns | 111,585,865.13 ns (98.39%) |
| coalesce_acked_from_zero 1+1 entries | 📈 view plot 🚷 view threshold | 89.47 ns(+0.66%)Baseline: 88.87 ns | 90.14 ns (99.25%) |
| coalesce_acked_from_zero 10+1 entries | 📈 view plot 🚷 view threshold | 105.81 ns(-0.25%)Baseline: 106.08 ns | 107.22 ns (98.69%) |
| coalesce_acked_from_zero 1000+1 entries | 📈 view plot 🚷 view threshold | 91.80 ns(+1.63%)Baseline: 90.33 ns | 94.94 ns (96.69%) |
| coalesce_acked_from_zero 3+1 entries | 📈 view plot 🚷 view threshold | 106.37 ns(-0.20%)Baseline: 106.59 ns | 107.67 ns (98.79%) |
| decode 1048576 bytes, mask 3f | 📈 view plot 🚷 view threshold | 1,762,400.00 ns(+7.75%)Baseline: 1,635,675.79 ns | 1,811,574.79 ns (97.29%) |
| decode 1048576 bytes, mask 7f | 📈 view plot 🚷 view threshold | 5,059,000.00 ns(-0.15%)Baseline: 5,066,441.79 ns | 5,112,867.74 ns (98.95%) |
| decode 1048576 bytes, mask ff | 📈 view plot 🚷 view threshold | 3,015,100.00 ns(-0.47%)Baseline: 3,029,272.33 ns | 3,053,892.36 ns (98.73%) |
| decode 4096 bytes, mask 3f | 📈 view plot 🚷 view threshold | 6,244.30 ns(-15.05%)Baseline: 7,350.41 ns | 10,365.47 ns (60.24%) |
| decode 4096 bytes, mask 7f | 📈 view plot 🚷 view threshold | 19,607.00 ns(-0.98%)Baseline: 19,801.38 ns | 20,462.67 ns (95.82%) |
| decode 4096 bytes, mask ff | 📈 view plot 🚷 view threshold | 11,338.00 ns(-0.24%)Baseline: 11,365.34 ns | 12,518.55 ns (90.57%) |
| sent::Packets::take_ranges | 📈 view plot 🚷 view threshold | 4,545.50 ns(-3.70%)Baseline: 4,719.98 ns | 4,959.22 ns (91.66%) |
| transfer/pacing-false/same-seed/simulated-time/run | 📈 view plot 🚷 view threshold | 25,234,000,000.00 ns(-0.69%)Baseline: 25,409,179,710.14 ns | 26,027,111,806.36 ns (96.95%) |
| transfer/pacing-false/same-seed/wallclock-time/run | 📈 view plot 🚷 view threshold | 24,493,000.00 ns(-4.62%)Baseline: 25,679,956.52 ns | 27,106,506.30 ns (90.36%) |
| transfer/pacing-false/varying-seeds/simulated-time/run | 📈 view plot 🚷 view threshold | 25,193,000,000.00 ns(+0.07%)Baseline: 25,175,544,927.54 ns | 25,224,734,939.74 ns (99.87%) |
| transfer/pacing-false/varying-seeds/wallclock-time/run | 📈 view plot 🚷 view threshold | 25,125,000.00 ns(-2.48%)Baseline: 25,763,759.42 ns | 27,409,475.32 ns (91.67%) |
| transfer/pacing-true/same-seed/simulated-time/run | 📈 view plot 🚷 view threshold | 25,301,000,000.00 ns(-1.07%)Baseline: 25,575,881,159.42 ns | 25,885,357,263.99 ns (97.74%) |
| transfer/pacing-true/same-seed/wallclock-time/run | 📈 view plot 🚷 view threshold | 26,212,000.00 ns(-3.01%)Baseline: 27,026,626.09 ns | 28,600,519.26 ns (91.65%) |
| transfer/pacing-true/varying-seeds/simulated-time/run | 📈 view plot 🚷 view threshold | 24,998,000,000.00 ns(+0.01%)Baseline: 24,995,269,565.22 ns | 25,043,744,345.85 ns (99.82%) |
| transfer/pacing-true/varying-seeds/wallclock-time/run | 📈 view plot 🚷 view threshold | 25,369,000.00 ns(-3.41%)Baseline: 26,264,742.03 ns | 27,991,141.52 ns (90.63%) |
I am hesitant to disable qlog in Firefox at compile time.
We'd need to carefully explain to a user how to generate a qlog anyway, so I think @martinthomson's suggestion to ship them a custom build is workable. Do you see a reason that wouldn't work?
It would work, yes. That said, I think we should make recording a qlog as low-friction as possible. Currently one simply has to set a pref. Shipping a custom build instead, for their specific Firefox version and their specific OS, is significantly more involved than that.
Yes, qlog has an overhead. But compared to the many other low-hanging fruits we have in Firefox, is the above complexity worth the gain?
I am still puzzled why it shows up in profiles at all. We use Rc in various other hot paths. How is the Rc usage of qlog different?
One way forward would be to merge this with a default to "on", so we can more easily flip that later.
> One way forward would be to merge this with a default to "on", so we can more easily flip that later.
I did this now.
CodSpeed Performance Report
Merging #3005 will improve performance by 11.65%
Comparing larseggert:feat-1894 (88508a1) with main (b9c32c7)
Summary
⚡ 1 improvement
✅ 22 untouched
Benchmarks breakdown
| Mode | Benchmark | BASE | HEAD | Change |
|---|---|---|---|---|
| ⚡ | Simulation client | 852.3 ms | 763.4 ms | +11.65% |
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to b9c32c70e273cd89f25d7f0561e01a083b8bdf03.
neqo-latest as client
- neqo-latest vs. aioquic: A :warning:C1
- neqo-latest vs. go-x-net: A BP BA
- neqo-latest vs. haproxy: :rocket:~~M~~ A :rocket:~~C1~~ BP BA
- neqo-latest vs. kwik: BP BA
- neqo-latest vs. linuxquic: A :warning:L1 C1
- neqo-latest vs. lsquic: L1 C1
- neqo-latest vs. msquic: R :rocket:~~Z~~ A L1 C1
- neqo-latest vs. mvfst: A L1 :warning:C1 BA
- neqo-latest vs. nginx: A L1 C1 BP BA
- neqo-latest vs. ngtcp2: A :rocket:~~C1~~ CM
- neqo-latest vs. picoquic: :rocket:~~R~~ :warning:Z A L1 C1
- neqo-latest vs. quic-go: A :warning:L1 C1
- neqo-latest vs. quiche: A L1 :warning:C1 BP BA
- neqo-latest vs. quinn: A :rocket:~~L1~~
- neqo-latest vs. s2n-quic: A :rocket:~~BP~~ BA CM
- neqo-latest vs. tquic: S A BP BA
- neqo-latest vs. xquic: run cancelled after 20 min
neqo-latest as server
- aioquic vs. neqo-latest: :warning:L1 BP CM
- go-x-net vs. neqo-latest: CM
- kwik vs. neqo-latest: BP BA CM
- linuxquic vs. neqo-latest: :warning:BP
- msquic vs. neqo-latest: U :rocket:~~BP~~ CM
- mvfst vs. neqo-latest: Z A L1 C1 CM
- neqo vs. neqo-latest: :warning:C1
- openssl vs. neqo-latest: LR M A CM
- quic-go vs. neqo-latest: CM
- quiche vs. neqo-latest: CM
- quinn vs. neqo-latest: V2 CM
- s2n-quic vs. neqo-latest: CM
- tquic vs. neqo-latest: CM
- xquic vs. neqo-latest: M CM
All results
Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: H DC LR C20 M S R Z 3 B U L1 L2 :warning:C1 C2 6 V2 BP BA
- neqo-latest vs. go-x-net: H DC LR M B U L2 C2 6
- neqo-latest vs. haproxy: H DC LR C20 :rocket:~~M~~ S R Z 3 B U L1 L2 :rocket:~~C1~~ C2 6 V2
- neqo-latest vs. kwik: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2
- neqo-latest vs. linuxquic: H DC LR C20 M S R Z 3 B U E :warning:L1 L2 C2 6 V2 BP BA CM
- neqo-latest vs. lsquic: H DC LR C20 M S R Z 3 B U E A L2 C2 6 V2 BP BA CM
- neqo-latest vs. msquic: H DC LR C20 M S :rocket:~~Z~~ B U L2 C2 6 V2 BP BA
- neqo-latest vs. mvfst: H DC LR M R Z 3 B U L2 :warning:C1 C2 6 BP :warning:BA
- neqo-latest vs. neqo: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- neqo-latest vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- neqo-latest vs. nginx: H DC LR C20 M S R Z 3 B U L2 C2 6
- neqo-latest vs. ngtcp2: H DC LR C20 M S R Z 3 B U E L1 L2 :rocket:~~C1~~ C2 6 V2 BP BA
- neqo-latest vs. picoquic: H DC LR C20 M S :warning:Z :rocket:~~R~~ 3 B U E L2 C2 6 V2 BP BA
- neqo-latest vs. quic-go: H DC LR C20 M S R Z 3 B U :warning:L1 L2 C2 6 BP BA
- neqo-latest vs. quiche: H DC LR C20 M S R Z 3 B U L2 :warning:C1 C2 6
- neqo-latest vs. quinn: H DC LR C20 M S R Z 3 B U E :rocket:~~L1~~ L2 C1 C2 6 BP BA
- neqo-latest vs. s2n-quic: H DC LR C20 M S R 3 B U E L1 L2 C1 C2 6 :rocket:~~BP~~
- neqo-latest vs. tquic: H DC LR C20 M R Z 3 B U L1 L2 C1 C2 6
neqo-latest as server
- aioquic vs. neqo-latest: H DC LR C20 M S R Z 3 B U A :warning:L1 L2 C1 C2 6 V2 :warning:BP BA
- chrome vs. neqo-latest: 3
- go-x-net vs. neqo-latest: H DC LR M B U A L2 C2 6 BP BA
- kwik vs. neqo-latest: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2
- linuxquic vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 :warning:BP BA CM
- lsquic vs. neqo-latest: H DC LR C20 M S R 3 B E A L1 L2 C1 C2 6 V2 BP BA CM
- msquic vs. neqo-latest: H DC LR C20 M S R Z B A L1 L2 C1 C2 6 V2 :rocket:~~BP~~ BA
- mvfst vs. neqo-latest: H DC LR M 3 B L2 C2 6 BP BA
- neqo vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 :warning:C1 C2 6 V2 BP BA CM
- ngtcp2 vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- openssl vs. neqo-latest: H DC C20 S R 3 B L2 C2 6 BP BA
- picoquic vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- quic-go vs. neqo-latest: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 BP BA
- quiche vs. neqo-latest: H DC LR M S R Z 3 B A L1 L2 C1 C2 6 BP BA
- quinn vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 BP BA
- s2n-quic vs. neqo-latest: H DC LR M S R 3 B E A L1 L2 C1 C2 6 BP BA
- tquic vs. neqo-latest: H DC LR M S R Z 3 B A L1 L2 C1 C2 6 BP BA
- xquic vs. neqo-latest: H DC LR C20 S R Z 3 B U A L1 L2 C1 C2 6 BP BA
Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: E CM
- neqo-latest vs. go-x-net: C20 S R Z 3 E L1 C1 V2 CM
- neqo-latest vs. haproxy: E CM
- neqo-latest vs. kwik: E CM
- neqo-latest vs. msquic: 3 E CM
- neqo-latest vs. mvfst: C20 S E V2 CM
- neqo-latest vs. nginx: E V2 CM
- neqo-latest vs. picoquic: CM
- neqo-latest vs. quic-go: E V2 CM
- neqo-latest vs. quiche: E V2 CM
- neqo-latest vs. quinn: V2 CM
- neqo-latest vs. s2n-quic: Z V2
- neqo-latest vs. tquic: E V2 CM
neqo-latest as server
- aioquic vs. neqo-latest: E
- chrome vs. neqo-latest: H DC LR C20 M S R Z B U E A L1 L2 C1 C2 6 V2 BP BA CM
- go-x-net vs. neqo-latest: C20 S R Z 3 E L1 C1 V2
- kwik vs. neqo-latest: E
- lsquic vs. neqo-latest: Z U
- msquic vs. neqo-latest: 3 E
- mvfst vs. neqo-latest: C20 S R U E V2
- openssl vs. neqo-latest: Z U E L1 C1 V2
- quic-go vs. neqo-latest: E V2
- quiche vs. neqo-latest: C20 U E V2
- s2n-quic vs. neqo-latest: C20 Z U V2
- tquic vs. neqo-latest: C20 U E V2
- xquic vs. neqo-latest: E V2
Benchmark results
Performance differences relative to b9c32c70e273cd89f25d7f0561e01a083b8bdf03.
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: :broken_heart: Performance has regressed.
time: [204.94 ms 205.26 ms 205.59 ms]
thrpt: [486.41 MiB/s 487.18 MiB/s 487.94 MiB/s]
change:
time: [+1.7802% +2.0308% +2.2733%] (p = 0.00 < 0.05)
thrpt: [… -1.9904% -1.7490%]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: :broken_heart: Performance has regressed.
time: [289.46 ms 291.06 ms 292.73 ms]
thrpt: [34.161 Kelem/s 34.357 Kelem/s 34.548 Kelem/s]
change:
time: [+1.0421% +1.8390% +2.7153%] (p = 0.00 < 0.05)
thrpt: [… -1.8058% -1.0314%]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: No change in performance detected.
time: [38.742 ms 38.943 ms 39.174 ms]
thrpt: [25.527 B/s 25.679 B/s 25.812 B/s]
change:
time: [-0.2618% +0.4629% +1.2102%] (p = 0.23 > 0.05)
thrpt: [-1.1958% -0.4608% +0.2625%]
Found 11 outliers among 100 measurements (11.00%)
3 (3.00%) high mild
8 (8.00%) high severe
1-conn/1-100mb-req/mtu-1504 (aka. Upload)/client: :broken_heart: Performance has regressed.
time: [212.24 ms 212.54 ms 212.88 ms]
thrpt: [469.74 MiB/s 470.50 MiB/s 471.17 MiB/s]
change:
time: [+1.2497% +1.4713% +1.6921%] (p = 0.00 < 0.05)
thrpt: [… -1.4500% -1.2343%]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high severe
decode 4096 bytes, mask ff: No change in performance detected.
time: [11.307 µs 11.338 µs 11.377 µs]
change: [-1.1068% -0.4029% +0.1505%] (p = 0.24 > 0.05)
Found 18 outliers among 100 measurements (18.00%)
2 (2.00%) low severe
6 (6.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe
decode 1048576 bytes, mask ff: No change in performance detected.
time: [2.9970 ms 3.0151 ms 3.0417 ms]
change: [-0.1969% +0.5183% +1.3522%] (p = 0.24 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
11 (11.00%) high severe
decode 4096 bytes, mask 7f: No change in performance detected.
time: [19.561 µs 19.607 µs 19.659 µs]
change: [-0.7654% -0.3012% +0.1011%] (p = 0.18 > 0.05)
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low severe
3 (3.00%) high mild
9 (9.00%) high severe
decode 1048576 bytes, mask 7f: No change in performance detected.
time: [5.0379 ms 5.0590 ms 5.0856 ms]
change: [-0.2910% +0.2424% +0.9024%] (p = 0.41 > 0.05)
Found 17 outliers among 100 measurements (17.00%)
17 (17.00%) high severe
decode 4096 bytes, mask 3f: No change in performance detected.
time: [6.2086 µs 6.2443 µs 6.2874 µs]
change: [-0.0871% +0.4587% +1.0706%] (p = 0.13 > 0.05)
Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high severe
decode 1048576 bytes, mask 3f: No change in performance detected.
time: [1.7578 ms 1.7624 ms 1.7700 ms]
change: [-0.1523% +0.1878% +0.6080%] (p = 0.46 > 0.05)
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high severe
1-streams/each-1000-bytes/wallclock-time: Change within noise threshold.
time: [581.63 µs 582.31 µs 582.98 µs]
change: [-1.8570% -1.2554% -0.7608%] (p = 0.00 < 0.05)
1-streams/each-1000-bytes/simulated-time: No change in performance detected.
time: [118.96 ms 119.17 ms 119.38 ms]
thrpt: [8.1801 KiB/s 8.1947 KiB/s 8.2095 KiB/s]
change:
time: [-0.0123% +0.2533% +0.5218%] (p = 0.06 > 0.05)
thrpt: [-0.5190% -0.2527% +0.0123%]
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1000-streams/each-1-bytes/wallclock-time: No change in performance detected.
time: [12.659 ms 12.703 ms 12.747 ms]
change: [-0.7218% -0.1683% +0.3682%] (p = 0.57 > 0.05)
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1000-streams/each-1-bytes/simulated-time: No change in performance detected.
time: [2.3275 s 2.3310 s 2.3345 s]
thrpt: [428.36 B/s 429.01 B/s 429.65 B/s]
change:
time: [-0.3193% -0.1129% +0.1008%] (p = 0.30 > 0.05)
thrpt: [-0.1007% +0.1130% +0.3204%]
1000-streams/each-1000-bytes/wallclock-time: Change within noise threshold.
time: [51.202 ms 51.328 ms 51.454 ms]
change: [+0.6878% +1.0238% +1.3794%] (p = 0.00 < 0.05)
1000-streams/each-1000-bytes/simulated-time: No change in performance detected.
time: [15.909 s 16.161 s 16.410 s]
thrpt: [59.509 KiB/s 60.426 KiB/s 61.385 KiB/s]
change:
time: [-3.4056% -1.1156% +1.2662%] (p = 0.33 > 0.05)
thrpt: [-1.2503% +1.1282% +3.5256%]
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
time: [89.170 ns 89.465 ns 89.762 ns]
change: [-1.0811% -0.3994% +0.1991%] (p = 0.23 > 0.05)
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
time: [106.04 ns 106.37 ns 106.73 ns]
change: [-0.5475% -0.0752% +0.3487%] (p = 0.75 > 0.05)
Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
1 (1.00%) high mild
11 (11.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
time: [105.32 ns 105.81 ns 106.58 ns]
change: [-0.9208% -0.1078% +0.9276%] (p = 0.83 > 0.05)
Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
time: [91.629 ns 91.800 ns 91.980 ns]
change: [-0.6966% -0.0793% +0.5928%] (p = 0.81 > 0.05)
Found 6 outliers among 100 measurements (6.00%)
3 (3.00%) high mild
3 (3.00%) high severe
RxStreamOrderer::inbound_frame(): Change within noise threshold.
time: [109.63 ms 109.79 ms 110.01 ms]
change: [-0.3515% -0.1905% +0.0026%] (p = 0.03 < 0.05)
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
sent::Packets::take_ranges: No change in performance detected.
time: [4.4468 µs 4.5455 µs 4.6355 µs]
change: [-5.2416% -2.0241% +1.0287%] (p = 0.23 > 0.05)
transfer/pacing-false/varying-seeds/wallclock-time/run: Change within noise threshold.
time: [25.087 ms 25.125 ms 25.163 ms]
change: [+0.8368% +1.0845% +1.3192%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-false/varying-seeds/simulated-time/run: No change in performance detected.
time: [25.157 s 25.193 s 25.229 s]
thrpt: [162.36 KiB/s 162.59 KiB/s 162.82 KiB/s]
change:
time: [-0.2776% -0.0585% +0.1646%] (p = 0.59 > 0.05)
thrpt: [-0.1643% +0.0586% +0.2784%]
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
transfer/pacing-true/varying-seeds/wallclock-time/run: No change in performance detected.
time: [25.309 ms 25.369 ms 25.433 ms]
change: [-0.4367% -0.0774% +0.2626%] (p = 0.68 > 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
transfer/pacing-true/varying-seeds/simulated-time/run: No change in performance detected.
time: [24.960 s 24.998 s 25.036 s]
thrpt: [163.61 KiB/s 163.86 KiB/s 164.10 KiB/s]
change:
time: [-0.1913% +0.0276% +0.2323%] (p = 0.80 > 0.05)
thrpt: [-0.2317% -0.0276% +0.1917%]
transfer/pacing-false/same-seed/wallclock-time/run: Change within noise threshold.
time: [24.471 ms 24.493 ms 24.517 ms]
change: [-1.6039% -1.4213% -1.2631%] (p = 0.00 < 0.05)
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
transfer/pacing-false/same-seed/simulated-time/run: No change in performance detected.
time: [25.234 s 25.234 s 25.234 s]
thrpt: [162.32 KiB/s 162.32 KiB/s 162.32 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
transfer/pacing-true/same-seed/wallclock-time/run: Change within noise threshold.
time: [26.184 ms 26.212 ms 26.245 ms]
change: [+0.5857% +0.7167% +0.8679%] (p = 0.00 < 0.05)
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
transfer/pacing-true/same-seed/simulated-time/run: No change in performance detected.
time: [25.301 s 25.301 s 25.301 s]
thrpt: [161.89 KiB/s 161.89 KiB/s 161.89 KiB/s]
change:
time: [+0.0000% +0.0000% +0.0000%] (p = NaN > 0.05)
thrpt: [+0.0000% +0.0000% +0.0000%]
Download data for profiler.firefox.com or download performance comparison data.
Client/server transfer results
Performance differences relative to b9c32c70e273cd89f25d7f0561e01a083b8bdf03.
Transfer of 33554432 bytes over loopback, min. 100 runs. All unit-less numbers are in milliseconds.
| Client vs. server (params) | Mean ± σ | Min | Max | MiB/s ± σ | Δ main | Δ main (%) |
|---|---|---|---|---|---|---|
| google vs. google | 476.2 ± 3.6 | 470.2 | 487.6 | 67.2 ± 8.9 | | |
| google vs. neqo (cubic, paced) | 279.3 ± 4.3 | 270.1 | 288.4 | 114.6 ± 7.4 | :green_heart: -2.6 | -0.9% |
| msquic vs. msquic | 207.2 ± 83.3 | 142.7 | 549.5 | 154.5 ± 0.4 | | |
| msquic vs. neqo (cubic, paced) | 228.2 ± 82.6 | 157.4 | 582.1 | 140.2 ± 0.4 | 5.0 | 2.2% |
| neqo vs. google (cubic, paced) | 779.0 ± 3.5 | 772.6 | 789.5 | 41.1 ± 9.1 | :broken_heart: 1.6 | 0.2% |
| neqo vs. msquic (cubic, paced) | 158.0 ± 4.6 | 150.6 | 166.6 | 202.6 ± 7.0 | 0.5 | 0.3% |
| neqo vs. neqo (cubic) | 93.0 ± 4.2 | 86.1 | 103.5 | 344.0 ± 7.6 | -1.2 | -1.3% |
| neqo vs. neqo (cubic, paced) | 92.2 ± 4.1 | 81.9 | 108.2 | 346.9 ± 7.8 | -0.8 | -0.8% |
| neqo vs. neqo (reno) | 91.8 ± 4.5 | 83.6 | 105.0 | 348.4 ± 7.1 | :green_heart: -2.0 | -2.2% |
| neqo vs. neqo (reno, paced) | 94.9 ± 5.1 | 87.5 | 118.2 | 337.0 ± 6.3 | -0.3 | -0.3% |
| neqo vs. quiche (cubic, paced) | 194.3 ± 5.1 | 187.5 | 216.6 | 164.7 ± 6.3 | :broken_heart: 1.8 | 0.9% |
| neqo vs. s2n (cubic, paced) | 220.8 ± 4.7 | 213.4 | 230.8 | 144.9 ± 6.8 | :green_heart: -2.2 | -1.0% |
| quiche vs. neqo (cubic, paced) | 156.7 ± 4.7 | 143.1 | 167.6 | 204.2 ± 6.8 | :broken_heart: 1.9 | 1.2% |
| quiche vs. quiche | 144.9 ± 4.6 | 137.2 | 156.9 | 220.9 ± 7.0 | | |
| s2n vs. neqo (cubic, paced) | 170.0 ± 4.2 | 161.7 | 182.7 | 188.3 ± 7.6 | 0.1 | 0.0% |
| s2n vs. s2n | 247.4 ± 28.1 | 231.8 | 344.6 | 129.3 ± 1.1 | | |
Download data for profiler.firefox.com or download performance comparison data.
Regarding feature flagging, @larseggert, what do you think of https://github.com/mozilla/neqo/pull/3129/ instead? It is less intrusive and should provide the same performance characteristics as this pull request.
If you agree, I suggest merging the changes related to passing `now` around here, but not the feature flagging. I would then clean up #3129.
Closing in favor of #3129.