chore(ci): run tarpaulin in release mode
This is a small experiment: it turns out compiling is relatively quick, on a PR I just did (3 minutes), but running the tarpaulin build took around 8 minutes. I'm wondering if increasing the compile times and lowering the run time is worth it, hence this experiment.
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 78.00%. Comparing base (
30f3a3c) to head (3082fc8). Report is 60 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #4139 +/- ##
==========================================
- Coverage 84.72% 78.00% -6.73%
==========================================
Files 269 268 -1
Lines 28779 24773 -4006
==========================================
- Hits 24384 19325 -5059
- Misses 4395 5448 +1053
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Results of the experiment:
- baseline:
- incremental build: ~3min30s
- run: ~7min
- compiling with O1:
- non-incremental build: 15min12s
- run: ~5min24s
- compiling with O2:
- non-incremental build: 29min19s
- run: ~5min30s
Since the build times are not fair comparisons (this PR could only have non-incremental build times, while for the baseline we get incremental compile times, I'm curious if others would like to enable the O1 level on main for a day or two, compare how it fares in terms of incremental compile times, and revert it if it's definitely not an improvement? cc @poljar
Results of the experiment:
* baseline: * incremental build: ~3min30s * run: ~7min * compiling with O1: * non-incremental build: 15min12s * run: ~5min24s * compiling with O2: * non-incremental build: 29min19s * run: ~5min30sSince the build times are not fair comparisons (this PR could only have non-incremental build times, while for the baseline we get incremental compile times, I'm curious if others would like to enable the O1 level on main for a day or two, compare how it fares in terms of incremental compile times, and revert it if it's definitely not an improvement? cc @poljar
Did you measure this locally or on CI? If we enable this on the CI how do we measure and compare this to our previous way, surely not via eyeballing?
Eyeballing the wall clock times reported in the log statements 👀 Even if not precise at the very second, these give us the right orders of magnitude IMO. What I'd really want is likely to get an idea of the cost of an incremental build, and that could be done locally if someone were very bored…
Ran some things during launch, to have a rough idea of how this would behave on my machine, in terms of compile times at least:
| Task | Wall-clock time | User-time |
|---|---|---|
| tarpaulin -O0 initial | 116 seconds | 15.8 minutes |
| tarpaulin -O0 rebuild | 20.23 seconds | 65.85 seconds |
| tarpaulin -O1 initial | 272 seconds | 72.44 minutes |
| tarpaulin -O1 rebuild | 73 seconds | 15.44 minutes |
So, while this doesn't tell us what kind of compile-time slowdown would be caused on CI (especially since my machine has lots of CPU cores, and I suspect CI doesn't), it tells us a rebuild might be between > ~3 and 14~ times slower than it is right now, and I suspect this would cancel the effects of having a faster runtime.
I'd still be inclined to try and merge this in production for a short while (a few days), rather than making back-of-the-envelope calculation like this, if others are interested. Otherwise I think it's ok to close this PR, there are other build-time optimizations to chase.
We chatted briefly about this, and we think it's an easy revert in case it's a regression. Let's find out :)
A NEW RACE CONDITION APPEARS!
thread 'tests::timeline::test_enabling_backups_retries_decryption' panicked at testing/matrix-sdk-integration-testing/src/tests/timeline.rs:375:10:
We should be able to send a message to our new room: UnknownError("EncryptionNotEnabled")
- The bug observed in the codecov task was an Heisenbug, since as soon as I added logging for debug, it disappeared.
- The regression in terms of coverage is because we start losing track of some lines that are effectively tested. Here's one example: https://app.codecov.io/gh/matrix-org/matrix-rust-sdk/pull/4139/blob/crates/matrix-sdk-base/src/rooms/normal.rs#L1420 It can't be that line 1421 be tested without the other lines now marked as missed being reached (we have tests that will exercise these code paths indirectly, e.g. here).
Because of (2), I think we should close this. It's possible that enabling O1 enables some extra inlining that confuses the codecov instrumentation, thus losing track of locations in the coverage. It's a bit of an unfortunate outcome, but at least we learned a thing :smiling_face_with_tear: