ci: run tests on Android emulator
Counterpart to https://github.com/quinn-rs/quinn/pull/1950.
Fixes https://github.com/mozilla/neqo/issues/2028.
Failing with:
android-actions/setup-android@v2 and reactivecircus/android-emulator-runner@v2 are not allowed to be used in mozilla/neqo.
@larseggert where do I find Mozilla's list of allowed GitHub actions?
https://github.com/mozilla/neqo/settings/actions under "General" has the list. Let me know if that is not visible to you.
Failed Interop Tests
QUIC Interop Runner, client vs. server, differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.
neqo-latest as client
- neqo-latest vs. aioquic: Z
- neqo-latest vs. go-x-net: BP BA
- neqo-latest vs. haproxy: BP BA
- neqo-latest vs. lsquic: L1 C1
- neqo-latest vs. msquic: R Z A L1 C1
- neqo-latest vs. mvfst: A L1 C1 BA
- neqo-latest vs. nginx: BP BA
- neqo-latest vs. ngtcp2: CM
- neqo-latest vs. picoquic: A :rocket:~~C1~~
- neqo-latest vs. quic-go: A
- neqo-latest vs. quiche: BP BA
- neqo-latest vs. s2n-quic: BP BA CM
- neqo-latest vs. tquic: S BP BA
- neqo-latest vs. xquic: run cancelled after 20 min
neqo-latest as server
- aioquic vs. neqo-latest: CM
- go-x-net vs. neqo-latest: CM
- kwik vs. neqo-latest: BP BA CM
- lsquic vs. neqo-latest: run cancelled after 20 min
- msquic vs. neqo-latest: Z U CM
- mvfst vs. neqo-latest: Z A L1 C1 CM
- openssl vs. neqo-latest: LR M CM
- quic-go vs. neqo-latest: run cancelled after 20 min
- quiche vs. neqo-latest: CM
- quinn vs. neqo-latest: V2 CM
- s2n-quic vs. neqo-latest: CM
- tquic vs. neqo-latest: CM
- xquic vs. neqo-latest: M CM
All results
Succeeded Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: H DC LR C20 M S R 3 B U A L1 L2 C1 C2 6 V2 BP BA
- neqo-latest vs. go-x-net: H DC LR M B U A L2 C2 6
- neqo-latest vs. haproxy: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2
- neqo-latest vs. kwik: :rocket:~~H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2 BP BA~~
- neqo-latest vs. lsquic: H DC LR C20 M S R Z 3 B U E A L2 C2 6 V2 BP BA
- neqo-latest vs. msquic: H DC LR C20 M S B U L2 C2 6 V2 BP BA
- neqo-latest vs. mvfst: H DC LR M R Z 3 B U L2 C2 6 BP
- neqo-latest vs. neqo: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- neqo-latest vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- neqo-latest vs. nginx: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6
- neqo-latest vs. ngtcp2: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA
- neqo-latest vs. picoquic: H DC LR C20 M S R Z 3 B U E L1 L2 :rocket:~~C1~~ C2 6 V2 BP BA
- neqo-latest vs. quic-go: H DC LR C20 M S R Z 3 B U L1 L2 C1 C2 6 BP BA
- neqo-latest vs. quiche: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6
- neqo-latest vs. quinn: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 BP BA
- neqo-latest vs. s2n-quic: H DC LR C20 M S R 3 B U E A L1 L2 C1 C2 6
- neqo-latest vs. tquic: H DC LR C20 M R Z 3 B U A L1 L2 C1 C2 6
neqo-latest as server
- aioquic vs. neqo-latest: H DC LR C20 M S R Z 3 B A L1 L2 C1 C2 6 V2 BP BA
- chrome vs. neqo-latest: 3
- go-x-net vs. neqo-latest: H DC LR M B U A L2 C2 6 BP BA
- kwik vs. neqo-latest: H DC LR C20 M S R Z 3 B U A L1 L2 C1 C2 6 V2
- msquic vs. neqo-latest: H DC LR C20 M S R B A L1 L2 C1 C2 6 V2 BP BA
- mvfst vs. neqo-latest: H DC LR M 3 B L2 C2 6 BP BA
- neqo vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- ngtcp2 vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A :rocket:~~L1~~ L2 C1 C2 6 V2 BP BA CM
- openssl vs. neqo-latest: H DC C20 S R 3 B A L2 C2 6 BP BA
- picoquic vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 V2 BP BA CM
- quiche vs. neqo-latest: H DC LR M S R Z 3 B A L1 L2 C1 C2 6 BP BA
- quinn vs. neqo-latest: H DC LR C20 M S R Z 3 B U E A L1 L2 C1 C2 6 BP BA
- s2n-quic vs. neqo-latest: H DC LR M S R 3 B E A L1 L2 C1 C2 6 BP BA
- tquic vs. neqo-latest: H DC LR M S R Z 3 B A L1 L2 C1 C2 6 BP BA
- xquic vs. neqo-latest: H DC LR C20 S R Z 3 B U A L1 L2 C1 C2 6 BP BA
Unsupported Interop Tests
QUIC Interop Runner, client vs. server
neqo-latest as client
- neqo-latest vs. aioquic: E CM
- neqo-latest vs. go-x-net: C20 S R Z 3 E L1 C1 V2 CM
- neqo-latest vs. haproxy: E CM
- neqo-latest vs. kwik: E CM
- neqo-latest vs. lsquic: CM
- neqo-latest vs. msquic: 3 E CM
- neqo-latest vs. mvfst: C20 S E V2 CM
- neqo-latest vs. nginx: E V2 CM
- neqo-latest vs. picoquic: CM
- neqo-latest vs. quic-go: E V2 CM
- neqo-latest vs. quiche: E V2 CM
- neqo-latest vs. quinn: V2 CM
- neqo-latest vs. s2n-quic: Z V2
- neqo-latest vs. tquic: E V2 CM
neqo-latest as server
- aioquic vs. neqo-latest: U E
- chrome vs. neqo-latest: H DC LR C20 M S R Z B U E A L1 L2 C1 C2 6 V2 BP BA CM
- go-x-net vs. neqo-latest: C20 S R Z 3 E L1 C1 V2
- kwik vs. neqo-latest: E
- msquic vs. neqo-latest: 3 E
- mvfst vs. neqo-latest: C20 S R U E V2
- openssl vs. neqo-latest: Z U E L1 C1 V2
- quiche vs. neqo-latest: C20 U E V2
- s2n-quic vs. neqo-latest: C20 Z U V2
- tquic vs. neqo-latest: C20 U E V2
- xquic vs. neqo-latest: E V2
Firefox builds for this PR
The following builds are available for testing. Crossed-out builds did not succeed.
- Linux: Debug Release
- macOS: Debug Release
- Windows: Debug Release
@mxinden did you ask for the actions to be allowlisted? Should I?
I have not gotten to it, sorry. Help appreciated. I assume reactivecircus/android-emulator-runner is the most important one, i.e. most difficult to write ourselves.
@mxinden these actions should now be available: https://bugzilla.mozilla.org/show_bug.cgi?id=1931529
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 92.94%. Comparing base (8d27458) to head (e01a4b6). Report is 6 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #2058 +/- ##
==========================================
- Coverage 95.41% 92.94% -2.48%
==========================================
Files 115 115
Lines 36996 36996
Branches 36996 36996
==========================================
- Hits 35300 34385 -915
- Misses 1690 1758 +68
- Partials 6 853 +847
| Components | Coverage Δ | |
|---|---|---|
| neqo-common | 96.12% <ø> (-1.06%) | :arrow_down: |
| neqo-crypto | 80.54% <ø> (-9.90%) | :arrow_down: |
| neqo-http3 | 92.67% <ø> (-1.84%) | :arrow_down: |
| neqo-qpack | 94.21% <ø> (-2.08%) | :arrow_down: |
| neqo-transport | 94.06% <ø> (-2.18%) | :arrow_down: |
| neqo-udp | 87.05% <ø> (-7.65%) | :arrow_down: |
Cross-compilation of NSS for Android on Linux is failing. I will look into this.
It's probably because it is grabbing the precompiled Linux build for NSS from the action cache. We need to differentiate between Linux and Android when building/caching NSS.
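A minimal sketch of that idea, assuming the cache key can be derived from the NSS version plus the Rust target triple. The helper name `nss_cache_key` and the version string are hypothetical and are not taken from the neqo workflow.

```rust
/// Hypothetical helper: build a cache key for a prebuilt NSS that is unique
/// per target triple, so a Linux build is never restored for an Android job.
fn nss_cache_key(nss_version: &str, target_triple: &str) -> String {
    format!("nss-{nss_version}-{target_triple}")
}

fn main() {
    // The version number is illustrative only.
    let linux = nss_cache_key("3.107", "x86_64-unknown-linux-gnu");
    let android = nss_cache_key("3.107", "x86_64-linux-android");
    // Same NSS version, different targets: the keys must not collide.
    assert_ne!(linux, android);
    println!("{linux}");
    println!("{android}");
}
```

In the CI workflow itself the same effect would come from including the target in whatever key the cache step uses.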
We need to discuss with the NSS team how they build for Android. I suspect that, once again, there are no instructions, and that the only thing known to work is whatever the Firefox build does. Topic for Thursday.
I'm giving up. The build system for NSS and esp. NSPR is just too horrible. It appears the NSS folks don't know how to do an Android build either. So until someone figures this out, this PR is blocked.
I'm not sure if this will be useful, but it looks like application-services builds NSS on android: https://github.com/mozilla/application-services/blob/main/libs/build-nss-android.sh.
Benchmark results
Performance differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.
decode 4096 bytes, mask ff: No change in performance detected.
time: [11.740 µs 11.773 µs 11.813 µs]
change: [-0.7101% -0.2612% +0.1297%] (p = 0.23 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
4 (4.00%) low mild
1 (1.00%) high mild
5 (5.00%) high severe
decode 1048576 bytes, mask ff: No change in performance detected.
time: [2.8930 ms 2.9023 ms 2.9134 ms]
change: [-0.3801% +0.0744% +0.5799%] (p = 0.75 > 0.05)
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) high mild
9 (9.00%) high severe
decode 4096 bytes, mask 7f: No change in performance detected.
time: [19.633 µs 19.688 µs 19.751 µs]
change: [-0.2990% +0.2405% +0.8626%] (p = 0.43 > 0.05)
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low mild
14 (14.00%) high severe
decode 1048576 bytes, mask 7f: No change in performance detected.
time: [4.7057 ms 4.7174 ms 4.7302 ms]
change: [-0.3109% +0.0602% +0.4265%] (p = 0.75 > 0.05)
Found 14 outliers among 100 measurements (14.00%)
14 (14.00%) high severe
decode 4096 bytes, mask 3f: No change in performance detected.
time: [6.2014 µs 6.2341 µs 6.2731 µs]
change: [-0.3412% +0.3133% +1.0449%] (p = 0.38 > 0.05)
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) high mild
7 (7.00%) high severe
decode 1048576 bytes, mask 3f: No change in performance detected.
time: [2.1051 ms 2.1120 ms 2.1190 ms]
change: [-0.1987% +0.1973% +0.5912%] (p = 0.35 > 0.05)
Found 9 outliers among 100 measurements (9.00%)
2 (2.00%) low mild
7 (7.00%) high severe
1 streams of 1 bytes/multistream: :broken_heart: Performance has regressed.
time: [68.624 µs 68.748 µs 68.877 µs]
change: [+5.7387% +6.0370% +6.3048%] (p = 0.00 < 0.05)
1000 streams of 1 bytes/multistream: :broken_heart: Performance has regressed.
time: [24.535 ms 24.574 ms 24.614 ms]
change: [+1.9679% +2.1815% +2.3986%] (p = 0.00 < 0.05)
10000 streams of 1 bytes/multistream: No change in performance detected.
time: [1.6450 s 1.6465 s 1.6480 s]
change: [-0.2568% -0.1179% +0.0141%] (p = 0.09 > 0.05)
Found 21 outliers among 100 measurements (21.00%)
3 (3.00%) low severe
10 (10.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
1 streams of 1000 bytes/multistream: :broken_heart: Performance has regressed.
time: [70.015 µs 70.572 µs 71.585 µs]
change: [+5.9432% +6.8591% +8.6440%] (p = 0.00 < 0.05)
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
100 streams of 1000 bytes/multistream: Change within noise threshold.
time: [3.2498 ms 3.2571 ms 3.2651 ms]
change: [+0.0659% +0.3759% +0.6923%] (p = 0.02 < 0.05)
Found 21 outliers among 100 measurements (21.00%)
21 (21.00%) high severe
1000 streams of 1000 bytes/multistream: :green_heart: Performance has improved.
time: [137.86 ms 137.93 ms 138.01 ms]
change: [-4.6993% -4.6211% -4.5483%] (p = 0.00 < 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
time: [92.271 ns 92.583 ns 92.894 ns]
change: [-0.3052% +0.0953% +0.4886%] (p = 0.64 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
10 (10.00%) high mild
1 (1.00%) high severe
coalesce_acked_from_zero 3+1 entries: No change in performance detected.
time: [109.78 ns 110.13 ns 110.51 ns]
change: [-0.3723% -0.0072% +0.3742%] (p = 0.97 > 0.05)
Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
10 (10.00%) high severe
coalesce_acked_from_zero 10+1 entries: No change in performance detected.
time: [109.40 ns 109.73 ns 110.15 ns]
change: [-0.6774% -0.0390% +0.6586%] (p = 0.91 > 0.05)
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
6 (6.00%) low mild
4 (4.00%) high severe
coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
time: [91.943 ns 91.984 ns 92.032 ns]
change: [-7.8172% -2.2377% +1.3701%] (p = 0.57 > 0.05)
Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe
RxStreamOrderer::inbound_frame(): Change within noise threshold.
time: [115.56 ms 115.61 ms 115.66 ms]
change: [+0.1057% +0.1591% +0.2140%] (p = 0.00 < 0.05)
Found 17 outliers among 100 measurements (17.00%)
7 (7.00%) low severe
10 (10.00%) high severe
SentPackets::take_ranges: No change in performance detected.
time: [5.2878 µs 5.4737 µs 5.6721 µs]
change: [-5.0891% -1.4449% +2.0061%] (p = 0.46 > 0.05)
Found 7 outliers among 100 measurements (7.00%)
7 (7.00%) high mild
transfer/pacing-false/varying-seeds: Change within noise threshold.
time: [34.385 ms 34.452 ms 34.519 ms]
change: [-2.2350% -1.9757% -1.7051%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold.
time: [34.559 ms 34.616 ms 34.674 ms]
change: [-1.7091% -1.4692% -1.2245%] (p = 0.00 < 0.05)
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild
transfer/pacing-false/same-seed: Change within noise threshold.
time: [34.500 ms 34.555 ms 34.610 ms]
change: [-1.2164% -0.9952% -0.7644%] (p = 0.00 < 0.05)
transfer/pacing-true/same-seed: Change within noise threshold.
time: [34.768 ms 34.817 ms 34.866 ms]
change: [-0.7794% -0.5600% -0.3379%] (p = 0.00 < 0.05)
1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
time: [2.2016 s 2.2088 s 2.2162 s]
thrpt: [45.123 MiB/s 45.272 MiB/s 45.421 MiB/s]
change:
time: [-0.3991% +0.1008% +0.6041%] (p = 0.69 > 0.05)
thrpt: [-0.6005% -0.1007% +0.4007%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
time: [387.38 ms 389.24 ms 391.16 ms]
thrpt: [25.565 Kelem/s 25.691 Kelem/s 25.814 Kelem/s]
change:
time: [-0.2711% +0.4672% +1.2017%] (p = 0.22 > 0.05)
thrpt: [-1.1874% -0.4650% +0.2719%]
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
2 (2.00%) high mild
1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: :green_heart: Performance has improved.
time: [27.088 ms 27.784 ms 28.493 ms]
thrpt: [35.096 elem/s 35.993 elem/s 36.917 elem/s]
change:
time: [-10.788% -7.6617% -4.1286%] (p = 0.00 < 0.05)
thrpt: [… +8.2974% +12.093%]
1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: :green_heart: Performance has improved.
time: [3.1455 s 3.1632 s 3.1811 s]
thrpt: [31.436 MiB/s 31.614 MiB/s 31.792 MiB/s]
change:
time: [-12.180% -11.399% -10.625%] (p = 0.00 < 0.05)
thrpt: [… +12.865% +13.869%]
Client/server transfer results
Performance differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.
Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.
| Client | Server | CC | Pacing | Mean ± σ | Min | Max | Δ main | Δ main |
|---|---|---|---|---|---|---|---|---|
| neqo | neqo | reno | on | 530.7 ± 69.4 | 457.2 | 718.0 | 23.4 | 1.1% |
| neqo | neqo | reno | | 577.0 ± 240.0 | 453.1 | 1631.3 | 41.0 | 1.8% |
| neqo | neqo | cubic | on | 528.6 ± 51.5 | 462.1 | 674.0 | -4.0 | -0.2% |
| neqo | neqo | cubic | | 525.8 ± 46.7 | 466.3 | 659.5 | :broken_heart: 21.1 | 1.0% |
| neqo | | reno | on | 913.3 ± 94.0 | 680.1 | 1043.5 | 15.0 | 0.4% |
| neqo | | reno | | 908.8 ± 103.3 | 649.6 | 1084.9 | -1.1 | -0.0% |
| neqo | | cubic | on | 903.0 ± 93.7 | 642.1 | 1063.2 | 12.9 | 0.4% |
| neqo | | cubic | | 901.6 ± 104.7 | 658.1 | 1087.8 | 17.4 | 0.5% |
| | | | | 560.8 ± 46.8 | 528.9 | 737.1 | 8.5 | 0.4% |
| neqo | msquic | reno | on | 247.9 ± 77.9 | 202.0 | 613.8 | 12.5 | 1.3% |
| neqo | msquic | reno | | 234.7 ± 42.9 | 203.4 | 437.4 | 6.7 | 0.7% |
| neqo | msquic | cubic | on | 226.5 ± 24.7 | 200.4 | 305.4 | 5.2 | 0.6% |
| neqo | msquic | cubic | | 229.6 ± 33.9 | 202.7 | 373.1 | -11.4 | -1.2% |
| msquic | msquic | | | 120.3 ± 29.4 | 99.5 | 264.0 | -0.7 | -0.1% |
OK, this finally builds NSS. It's now failing because the mtu crate doesn't support Android yet, which is my next fix.
This is now waiting for https://github.com/mozilla/mtu/pull/96
This is now at a point where the emulator finally executes the tests. There are many failures.
@mxinden this is now ready for review, the tests pass (but check for FIXME).
@martinthomson could you please take a look, esp. at the change to neqo-crypto that allows overriding the NSS database directory? I don't think there is a security issue here, but YMMV.
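For readers without the diff at hand, here is a rough sketch of what such an override could look like. It is not the code in this PR: the function `nss_db_dir`, the environment variable `NSS_DB_DIR`, and the default path are all hypothetical names chosen for illustration.

```rust
use std::path::PathBuf;

/// Hypothetical: choose the NSS database directory, preferring a non-empty
/// override and otherwise falling back to the built-in default.
fn nss_db_dir(override_dir: Option<&str>, default_dir: &str) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(default_dir),
    }
}

fn main() {
    // Assumed variable name; read at run time so that a test run on an
    // emulator can point at a directory that exists on the device.
    let from_env = std::env::var("NSS_DB_DIR").ok();
    let dir = nss_db_dir(from_env.as_deref(), "./db");
    println!("{}", dir.display());
}
```

The security question then reduces to who controls the override value and whether the directory it names is trusted.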
@mxinden take another look, I backed out the vvec changes. I can force merge if you think this is ready.