jj
jj copied to clipboard
FR: make tests faster to recompile
Is your feature request related to a problem? Please describe. Recompiling jj's tests after a change is slow; there are 81 separate binaries that have to be linked and compiled. I have a fairly fast machine (Ryzen 7700x), but even here it takes about 50 seconds to recompile, longer than actually running the tests.
Describe the solution you'd like
Use a shared integration test binary, and mod test_abandon_command
etc for loading separate tests. That only has to link two binaries (one for lib/
, one for cli/
), and should greatly reduce compile times.
Describe alternatives you've considered
- Don't make any changes
- Switch from integration tests to unit tests. This might be possible but limits jj to tests that can be run in parallel, and is likely annoying to implement.
Additional context HEAD is 01825953aca5b9aab6d7e9db7777818482f99914; i've made local changes but none that should affect compile times.
Re your actual suggestions: I'm not an expert, so I don't have many comments. You can try these changes and see if it speeds things up. If it does, many people including myself will benefit :) Perhaps somebody who understands Rust compilation better will also chime in.
Some more general thoughts:
I've had compilation take even longer, it is not fun. Rust likes to eat CPU.
You can try the general suggestions, if you haven't already.
-
Make sure you're using
nextest
, though this might speed up test execution more than the compilation. -
If you are on Linux, you should try using
mold
, especially if you think that the linker using more cores would speed things up.
If you're on Windows, malware-protecting services (e.g. Windows Defender) could slow things up a lot. On a Mac, I had problems with that, and I'm guessing Windows is similar.
right, nexttest can't help with test compilation because it still has to run cargo build
to generate the test harnesses.
i'm on linux, i can try mold.
out of curiosity, how long do the tests take for you to recompile? is it about a minute as well?
First try (recompiled a bunch of dependencies too, but not all): 32.55 s
After deleting target/
: 38.4s.
timings.zip
Before I disabled the single-threaded malware detector, it was closer to 50s.
This is on a 12-core Apple M2 Mac. Macs use a multi-threaded linker similar to mold
AFAIK.
But re your original idea: it'd be really cool if it helps! :)
with mold it seems to take ... almost twice as long??? github isn't letting me upload the timings for some reason but here they are rendered: https://jyn514.github.io/assets/cargo-timing-20240202T071243Z.html
Bizarre. No idea why that would happen, and I'm not an expert on mold either. For the timings, you could zip them to make GitHub happy (it filters by extensions), but I like your way of sharing the html better.
oh lol i was using a version of mold from several years ago
with a new version it works faster than lld, but not that much faster, it's still about a second per test. https://jyn.dev/assets/cargo-timing-20240202T072514Z.html
i'll try combining them into a single integration test.
Just fyi, there's a previous attempt: #2139
Another workaround is to move cli tests to an empty crate. Since these tests just run the jj
executable, it works. The problem is that we can't express a binary dependency, so there's no way to ensure jj
is recompiled.
In the long run, this might not be what we want, particularly if we break jj into separate crates for subcommands (see https://blog.waleedkhan.name/rust-incremental-test-times/). Generally, I do iterative work by running only one or two test modules, and only run all of them later (or let CI do it), and I think putting all the integration tests into the same crate pessimizes that workflow.
@arxanas hmm, have you benchmarked the difference? if i checkout #2139 and run touch lib/src/lib.rs ; c t --no-run
with a fresh cache, it takes only 3.5 seconds to recompile all tests. the previous commit takes 11 seconds for all tests, and seven seconds for touch lib/src/lib.rs ; c t --no-run --test test_working_copy_sparse
(because it has to rebuild dependencies as well; with those cached it takes 1.4 seconds, which is better than 3.5 but ... not by that much).
i can try this again with a modern branch of course, a lot of tests have been added since then, but i suspect the time per test will be roughly the same.
btw, re
The cargo build timings for the incremental build doesn’t help. It just shows that the test I’m building takes 100% of the spent time.
the flag you want is cargo rustc -- -Ztime-passes
with a new version it works faster than lld, but not that much faster,
If people keep having this experience, we could remove the mold
recommendation from the contributing docs.
I had one really good experience where mold sped everything up by 2x (for ARM Linux running in QEMU on my Mac), and one experience with AMD where mold
didn't speed things up at all.
i can try this again with a modern branch of course, a lot of tests have been added since then, but i suspect the time per test will be roughly the same.
i've done this; it's now ~4 seconds for all tests
; hyperfine 'touch lib/src/lib.rs ; cargo t --test runner --no-run'
Time (mean ± σ): 4.055 s ± 0.123 s [User: 3.591 s, System: 1.593 s]
Range (min … max): 3.804 s … 4.159 s 10 runs
compared to 32-50 seconds for all tests and 2 seconds for a single test before
Benchmark 1: touch lib/src/lib.rs ; cargo t --no-run --test test_local_working_copy_concurrent
Time (mean ± σ): 2.106 s ± 0.072 s [User: 1.361 s, System: 0.653 s]
Range (min … max): 2.020 s … 2.257 s 10 runs
with a new version it works faster than lld, but not that much faster,
If people keep having this experience, we could remove the
mold
recommendation from the contributing docs.
iirc, the default on Linux is GNU ld. clang lld
is much faster, and mold
is somewhat faster than lld
.
fwiw, autotests = false
will help if we want to combine all tests without reorganizing directory structure.
https://doc.rust-lang.org/cargo/reference/cargo-targets.html#target-auto-discovery
yup, i'm using that :) and adding a test to make sure people don't forget the mod foo;
declaration
iirc, the default on Linux is GNU ld. clang
lld
is much faster, andmold
is somewhat faster thanlld
.
FWIW it's codebase dependent but in many cases for real Rust codebases, mold
can be up to 10x faster, e.g. I've had it take 30 second link times (which are nearly unbearable) to less than 3 seconds. I suspect it'll always be somewhere in this range ("a little to an unbelievable amount") in terms of improved perf.
The reality is that we don't have many crates and dependencies right now, so while our iteration time is somewhat slower, it puts a bit less pressure on the linker since there's not as much work to do. Our compile times aren't too long, so only having two crates isn't that bad. But realistically as jj
gets bigger, the compilation step will begin to take longer and longer, and we will probably need to start splitting things out more into crates due to that being the way to achieve parallel compilation; that is when mold
will begin to shine even more.
I responded to this effect at https://github.com/martinvonz/jj/pull/2914, but I don't think the test compilation time by itself is what we should be optimizing for, but rather the total test runtime for the full-test and incremental-test cases.