ci: run tests on Android emulator

Open mxinden opened this issue 1 year ago • 4 comments

Counterpart to https://github.com/quinn-rs/quinn/pull/1950.

Fixes https://github.com/mozilla/neqo/issues/2028.

mxinden avatar Aug 19 '24 12:08 mxinden

Failing with:

android-actions/setup-android@v2 and reactivecircus/android-emulator-runner@v2 are not allowed to be used in mozilla/neqo.

@larseggert where do I find Mozilla's list of allowed GitHub actions?

mxinden avatar Aug 19 '24 12:08 mxinden

https://github.com/mozilla/neqo/settings/actions under "General" has the list. Let me know if that is not visible to you.

larseggert avatar Aug 19 '24 12:08 larseggert

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

github-actions[bot] avatar Aug 19 '24 12:08 github-actions[bot]

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

  • Linux: Debug Release
  • macOS: Debug Release
  • Windows: Debug Release

github-actions[bot] avatar Aug 19 '24 14:08 github-actions[bot]

@mxinden did you ask for the actions to be allowlisted? Should I?

larseggert avatar Nov 15 '24 12:11 larseggert

I have not gotten to it, sorry. Help appreciated. I assume reactivecircus/android-emulator-runner is the most important one, i.e. most difficult to write ourselves.

mxinden avatar Nov 15 '24 12:11 mxinden

@mxinden these actions should now be available: https://bugzilla.mozilla.org/show_bug.cgi?id=1931529

larseggert avatar Nov 19 '24 16:11 larseggert

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 92.94%. Comparing base (8d27458) to head (e01a4b6). Report is 6 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2058      +/-   ##
==========================================
- Coverage   95.41%   92.94%   -2.48%     
==========================================
  Files         115      115              
  Lines       36996    36996              
  Branches    36996    36996              
==========================================
- Hits        35300    34385     -915     
- Misses       1690     1758      +68     
- Partials        6      853     +847     
| Components | Coverage Δ |
|---|---|
| neqo-common | 96.12% <ø> (-1.06%) :arrow_down: |
| neqo-crypto | 80.54% <ø> (-9.90%) :arrow_down: |
| neqo-http3 | 92.67% <ø> (-1.84%) :arrow_down: |
| neqo-qpack | 94.21% <ø> (-2.08%) :arrow_down: |
| neqo-transport | 94.06% <ø> (-2.18%) :arrow_down: |
| neqo-udp | 87.05% <ø> (-7.65%) :arrow_down: |

codecov[bot] avatar Nov 19 '24 17:11 codecov[bot]

Cross-compilation of NSS for Android on Linux is failing. I will look into this.

mxinden avatar Nov 19 '24 19:11 mxinden

It's probably because it is grabbing the precompiled Linux build for NSS from the action cache. We need to differentiate between Linux and Android when building/caching NSS.
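
One way to keep the two builds apart is to make the target part of the cache key. The following is only a minimal sketch of that idea, not the scheme used in this PR: the helper name `nss_cache_key`, the key layout, and the version string `3.107` are all placeholders chosen for illustration.

```rust
/// Build a cache key for a prebuilt NSS artifact.
/// Hypothetical helper: the name and key layout are assumptions,
/// not what the neqo CI actually uses.
fn nss_cache_key(nss_version: &str, target_triple: &str) -> String {
    // Including the full target triple keeps a Linux host build
    // (e.g. x86_64-unknown-linux-gnu) from being restored for an
    // Android cross build (e.g. aarch64-linux-android).
    format!("nss-{nss_version}-{target_triple}")
}

fn main() {
    // "3.107" is a placeholder version string.
    let linux = nss_cache_key("3.107", "x86_64-unknown-linux-gnu");
    let android = nss_cache_key("3.107", "aarch64-linux-android");
    // The two builds can no longer share a cache entry.
    assert_ne!(linux, android);
    println!("{linux}\n{android}");
}
```

With a key like this, a cache hit for the Linux build can never satisfy the Android job, at the cost of one extra cache entry per target.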

larseggert avatar Nov 20 '24 06:11 larseggert

We need to discuss with the NSS team how they build for Android. I suspect that, once again, there are no instructions, and that only the way the Firefox build does it actually works. Topic for Thursday.

larseggert avatar Nov 25 '24 12:11 larseggert

I'm giving up. The build system for NSS and esp. NSPR is just too horrible. It appears the NSS folks don't know how to do an Android build either. So until someone figures this out, this PR is blocked.

larseggert avatar Dec 16 '24 15:12 larseggert

I'm not sure if this will be useful, but it looks like application-services builds NSS on android: https://github.com/mozilla/application-services/blob/main/libs/build-nss-android.sh.

jschanck avatar Jan 23 '25 18:01 jschanck

Benchmark results

Performance differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.740 µs 11.773 µs 11.813 µs]
       change: [-0.7101% -0.2612% +0.1297%] (p = 0.23 > 0.05)

Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 4 (4.00%) low mild 1 (1.00%) high mild 5 (5.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [2.8930 ms 2.9023 ms 2.9134 ms]
       change: [-0.3801% +0.0744% +0.5799%] (p = 0.75 > 0.05)

Found 10 outliers among 100 measurements (10.00%) 1 (1.00%) high mild 9 (9.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.633 µs 19.688 µs 19.751 µs]
       change: [-0.2990% +0.2405% +0.8626%] (p = 0.43 > 0.05)

Found 15 outliers among 100 measurements (15.00%) 1 (1.00%) low mild 14 (14.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [4.7057 ms 4.7174 ms 4.7302 ms]
       change: [-0.3109% +0.0602% +0.4265%] (p = 0.75 > 0.05)

Found 14 outliers among 100 measurements (14.00%) 14 (14.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [6.2014 µs 6.2341 µs 6.2731 µs]
       change: [-0.3412% +0.3133% +1.0449%] (p = 0.38 > 0.05)

Found 8 outliers among 100 measurements (8.00%) 1 (1.00%) high mild 7 (7.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [2.1051 ms 2.1120 ms 2.1190 ms]
       change: [-0.1987% +0.1973% +0.5912%] (p = 0.35 > 0.05)

Found 9 outliers among 100 measurements (9.00%) 2 (2.00%) low mild 7 (7.00%) high severe

1 streams of 1 bytes/multistream: :broken_heart: Performance has regressed.
       time:   [68.624 µs 68.748 µs 68.877 µs]
       change: [+5.7387% +6.0370% +6.3048%] (p = 0.00 < 0.05)

1000 streams of 1 bytes/multistream: :broken_heart: Performance has regressed.
       time:   [24.535 ms 24.574 ms 24.614 ms]
       change: [+1.9679% +2.1815% +2.3986%] (p = 0.00 < 0.05)

10000 streams of 1 bytes/multistream: No change in performance detected.
       time:   [1.6450 s 1.6465 s 1.6480 s]
       change: [-0.2568% -0.1179% +0.0141%] (p = 0.09 > 0.05)

Found 21 outliers among 100 measurements (21.00%) 3 (3.00%) low severe 10 (10.00%) low mild 4 (4.00%) high mild 4 (4.00%) high severe

1 streams of 1000 bytes/multistream: :broken_heart: Performance has regressed.
       time:   [70.015 µs 70.572 µs 71.585 µs]
       change: [+5.9432% +6.8591% +8.6440%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%) 1 (1.00%) high mild 1 (1.00%) high severe

100 streams of 1000 bytes/multistream: Change within noise threshold.
       time:   [3.2498 ms 3.2571 ms 3.2651 ms]
       change: [+0.0659% +0.3759% +0.6923%] (p = 0.02 < 0.05)

Found 21 outliers among 100 measurements (21.00%) 21 (21.00%) high severe

1000 streams of 1000 bytes/multistream: :green_heart: Performance has improved.
       time:   [137.86 ms 137.93 ms 138.01 ms]
       change: [-4.6993% -4.6211% -4.5483%] (p = 0.00 < 0.05)

coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [92.271 ns 92.583 ns 92.894 ns]
       change: [-0.3052% +0.0953% +0.4886%] (p = 0.64 > 0.05)

Found 11 outliers among 100 measurements (11.00%) 10 (10.00%) high mild 1 (1.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [109.78 ns 110.13 ns 110.51 ns]
       change: [-0.3723% -0.0072% +0.3742%] (p = 0.97 > 0.05)

Found 12 outliers among 100 measurements (12.00%) 2 (2.00%) low mild 10 (10.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [109.40 ns 109.73 ns 110.15 ns]
       change: [-0.6774% -0.0390% +0.6586%] (p = 0.91 > 0.05)

Found 11 outliers among 100 measurements (11.00%) 1 (1.00%) low severe 6 (6.00%) low mild 4 (4.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [91.943 ns 91.984 ns 92.032 ns]
       change: [-7.8172% -2.2377% +1.3701%] (p = 0.57 > 0.05)

Found 13 outliers among 100 measurements (13.00%) 3 (3.00%) high mild 10 (10.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [115.56 ms 115.61 ms 115.66 ms]
       change: [+0.1057% +0.1591% +0.2140%] (p = 0.00 < 0.05)

Found 17 outliers among 100 measurements (17.00%) 7 (7.00%) low severe 10 (10.00%) high severe

SentPackets::take_ranges: No change in performance detected.
       time:   [5.2878 µs 5.4737 µs 5.6721 µs]
       change: [-5.0891% -1.4449% +2.0061%] (p = 0.46 > 0.05)

Found 7 outliers among 100 measurements (7.00%) 7 (7.00%) high mild

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [34.385 ms 34.452 ms 34.519 ms]
       change: [-2.2350% -1.9757% -1.7051%] (p = 0.00 < 0.05)

transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [34.559 ms 34.616 ms 34.674 ms]
       change: [-1.7091% -1.4692% -1.2245%] (p = 0.00 < 0.05)

Found 1 outliers among 100 measurements (1.00%) 1 (1.00%) high mild

transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [34.500 ms 34.555 ms 34.610 ms]
       change: [-1.2164% -0.9952% -0.7644%] (p = 0.00 < 0.05)

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [34.768 ms 34.817 ms 34.866 ms]
       change: [-0.7794% -0.5600% -0.3379%] (p = 0.00 < 0.05)

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: No change in performance detected.
       time:   [2.2016 s 2.2088 s 2.2162 s]
       thrpt:  [45.123 MiB/s 45.272 MiB/s 45.421 MiB/s]
change:
       time:   [-0.3991% +0.1008% +0.6041%] (p = 0.69 > 0.05)
       thrpt:  [-0.6005% -0.1007% +0.4007%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: No change in performance detected.
       time:   [387.38 ms 389.24 ms 391.16 ms]
       thrpt:  [25.565 Kelem/s 25.691 Kelem/s 25.814 Kelem/s]
change:
       time:   [-0.2711% +0.4672% +1.2017%] (p = 0.22 > 0.05)
       thrpt:  [-1.1874% -0.4650% +0.2719%]

Found 3 outliers among 100 measurements (3.00%) 1 (1.00%) low mild 2 (2.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: :green_heart: Performance has improved.
       time:   [27.088 ms 27.784 ms 28.493 ms]
       thrpt:  [35.096  elem/s 35.993  elem/s 36.917  elem/s]
change:
       time:   [-10.788% -7.6617% -4.1286%] (p = 0.00 < 0.05)
       thrpt:  [… +8.2974% +12.093%]
1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: :green_heart: Performance has improved.
       time:   [3.1455 s 3.1632 s 3.1811 s]
       thrpt:  [31.436 MiB/s 31.614 MiB/s 31.792 MiB/s]
change:
       time:   [-12.180% -11.399% -10.625%] (p = 0.00 < 0.05)
       thrpt:  [… +12.865% +13.869%]

Client/server transfer results

Performance differences relative to 3e951d1ce564b55fea5470bc3f75d4a486fe24a5.

Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.

| Client | Server | CC | Pacing | Mean ± σ | Min | Max | Δ main | Δ main |
|---|---|---|---|---|---|---|---|---|
| neqo | neqo | reno | on | 530.7 ± 69.4 | 457.2 | 718.0 | 23.4 | 1.1% |
| neqo | neqo | reno | | 577.0 ± 240.0 | 453.1 | 1631.3 | 41.0 | 1.8% |
| neqo | neqo | cubic | on | 528.6 ± 51.5 | 462.1 | 674.0 | -4.0 | -0.2% |
| neqo | neqo | cubic | | 525.8 ± 46.7 | 466.3 | 659.5 | :broken_heart: 21.1 | 1.0% |
| google | neqo | reno | on | 913.3 ± 94.0 | 680.1 | 1043.5 | 15.0 | 0.4% |
| google | neqo | reno | | 908.8 ± 103.3 | 649.6 | 1084.9 | -1.1 | -0.0% |
| google | neqo | cubic | on | 903.0 ± 93.7 | 642.1 | 1063.2 | 12.9 | 0.4% |
| google | neqo | cubic | | 901.6 ± 104.7 | 658.1 | 1087.8 | 17.4 | 0.5% |
| google | google | | | 560.8 ± 46.8 | 528.9 | 737.1 | 8.5 | 0.4% |
| neqo | msquic | reno | on | 247.9 ± 77.9 | 202.0 | 613.8 | 12.5 | 1.3% |
| neqo | msquic | reno | | 234.7 ± 42.9 | 203.4 | 437.4 | 6.7 | 0.7% |
| neqo | msquic | cubic | on | 226.5 ± 24.7 | 200.4 | 305.4 | 5.2 | 0.6% |
| neqo | msquic | cubic | | 229.6 ± 33.9 | 202.7 | 373.1 | -11.4 | -1.2% |
| msquic | msquic | | | 120.3 ± 29.4 | 99.5 | 264.0 | -0.7 | -0.1% |

github-actions[bot] avatar Mar 01 '25 02:03 github-actions[bot]

OK, this finally builds NSS. It's now dying on MTU not supporting Android yet, which is my next fix.

larseggert avatar Mar 01 '25 05:03 larseggert

This is now waiting for https://github.com/mozilla/mtu/pull/96

larseggert avatar Mar 03 '25 13:03 larseggert

This is now at a point where the simulator finally executes the tests. There are many failures.

larseggert avatar Mar 13 '25 16:03 larseggert

@mxinden this is now ready for review; the tests pass (but check for FIXME). @martinthomson please take a look, esp. at the change to neqo-crypto that allows overriding the NSS database directory. I don't think there is a security issue here, but YMMV.
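
For readers who have not looked at the diff, an override of this kind might look roughly like the sketch below. It is an illustration under stated assumptions and not the code in this PR: the environment variable name `NSS_DB_PATH`, the helper `nss_db_dir`, and the default directory `db` are all hypothetical.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical variable name, chosen for illustration only.
const NSS_DB_ENV: &str = "NSS_DB_PATH";

/// Pick the NSS database directory: use the override when one is
/// given and non-empty, otherwise fall back to the default.
fn nss_db_dir(override_dir: Option<&str>, default_dir: &str) -> PathBuf {
    match override_dir {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => PathBuf::from(default_dir),
    }
}

fn main() {
    // Read the (assumed) override from the environment, if set.
    let from_env = env::var(NSS_DB_ENV).ok();
    // "db" is a placeholder default, not neqo's real path.
    let dir = nss_db_dir(from_env.as_deref(), "db");
    println!("using NSS database directory: {}", dir.display());
}
```

Keeping the selection logic in a pure function that takes the override as an argument makes it testable without touching the process environment, which is useful on an emulator where the writable directory differs from the host.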

larseggert avatar Mar 14 '25 07:03 larseggert

@mxinden take another look, I backed out the vvec changes. I can force merge if you think this is ready.

larseggert avatar Mar 14 '25 13:03 larseggert