compile integration tests as a single binary
closes https://github.com/martinvonz/jj/issues/2912
Checklist
If applicable:
- [x] I have updated CHANGELOG.md
- [x] I have updated the documentation (README.md, docs/, demos/)
- [x] I have updated the config schema (cli/src/config-schema.json)
- [x] I have added tests to cover my changes
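For context, the single-binary approach replaces one test crate per `tests/*.rs` file with a single `tests/runner.rs` that pulls each former file in as a module. A sketch of that layout (the module names are illustrative, not the full jj test list):

```rust
// tests/runner.rs -- hypothetical sketch of the single-binary layout.
// Each former standalone integration-test file (tests/test_abandon.rs,
// tests/test_working_copy.rs, ...) becomes a module of one test crate,
// so cargo compiles and links a single test binary instead of many.
mod test_abandon;
mod test_working_copy;
// ... one `mod` line per former tests/*.rs file
```

Individual tests are then selected by a name filter (`cargo t --test runner -- test_working_copy`) rather than by `--test <file>`.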
Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.
The benchmark in the commit message is touch lib/src/lib.rs ; cargo t --test runner --no-run, and I'm assuming hyperfine throws out the first run like a good benchmarking tool should. So, it's all about incremental compilation, I think.
I also unscientifically tried touch lib/src/lib.rs ; cargo insta test --test-runner nextest -- test_abandon. It feels fast with this PR, though which linker you use might again make a difference.
I found another minor downside: the fish shell has completions for cargo test --test <TAB>, but not for cargo test -- <TAB> (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).
Regarding touch, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that cargo might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.
> Regarding `touch`, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that `cargo` might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.
Cargo does not do content-addressed hashing, no. You can see this if you set `CARGO_LOG=info`:
```
; touch lib/src/lib.rs ; CARGO_LOG=info cargo t --no-run
0.089419238s INFO cargo::core::compiler::fingerprint: stale: changed "/home/jyn/src/jj/lib/src/lib.rs"
0.089482460s INFO cargo::core::compiler::fingerprint: (vs) "/home/jyn/.local/lib/cargo/target/debug/.fingerprint/jj-lib-49a46e34c360fc2e/dep-lib-jj-lib"
0.089485415s INFO cargo::core::compiler::fingerprint: FileTime { seconds: 1706861032, nanos: 988871571 } < FileTime { seconds: 1706931264, nanos: 137617386 }
```
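Since the fingerprint check compares mtimes rather than content hashes, rewriting a file with byte-identical contents (or just `touch`ing it) is enough to mark the crate stale. A small self-contained demo of that mtime behavior (the temp-file path is made up for illustration):

```rust
use std::fs;
use std::thread;
use std::time::Duration;

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("fingerprint_mtime_demo.rs");
    // First write, then record the mtime cargo would have fingerprinted.
    fs::write(&path, "fn main() {}")?;
    let before = fs::metadata(&path)?.modified()?;

    // Wait past coarse (1 s) mtime resolution, then rewrite the file
    // with byte-identical contents -- the moral equivalent of `touch`.
    thread::sleep(Duration::from_millis(1100));
    fs::write(&path, "fn main() {}")?;
    let after = fs::metadata(&path)?.modified()?;

    // The mtime advanced even though nothing changed, so an mtime-based
    // fingerprint (like cargo's) now considers the file stale.
    assert!(after > before, "mtime should advance");
    println!("stale: mtime advanced, contents identical");
    fs::remove_file(&path)
}
```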
> Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.
Sure, but the times are small enough that numbers from my computer won't be comparable to yours; I'd recommend running this on your own machine to see what the times are.
Before this change:

```
; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test test_working_copy
  Time (mean ± σ):     3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):   3.400 s …  3.847 s    10 runs
```
After this change:

```
; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test runner -- test_working_copy
  Time (mean ± σ):     4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):   3.933 s …  4.346 s    10 runs
```
So the regression is only about half a second, from roughly 3.5 to 4 seconds.
(Using `cargo test` here is intentional; nextest is actually a bit slower than the built-in framework for single tests.)
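For anyone wondering why the `-- test_working_copy` filter still selects the right tests after the merge: by default, the built-in harness substring-matches the filter against each test's full module path, and the former per-file tests keep their module name as a prefix. A sketch of that matching semantics, with made-up test names:

```rust
// Sketch of how libtest's default name filtering works when you run
// `cargo t --test runner -- test_working_copy`: the positional filter
// is substring-matched against each test's full path.
fn matches_filter(filter: &str, test_path: &str) -> bool {
    test_path.contains(filter)
}

fn main() {
    // With the single-binary layout, tests keep their module prefix,
    // so the old per-file names still work as filters.
    assert!(matches_filter("test_working_copy", "test_working_copy::conflicts"));
    assert!(!matches_filter("test_working_copy", "test_abandon::basic"));
    println!("filter semantics: substring match on module path");
}
```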
> I found another minor downside: the fish shell has completions for `cargo test --test <TAB>`, but not for `cargo test -- <TAB>` (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).
Yeah, this is unfortunate. I don't know a way to fix this upstream in fish; it would have to parse the `mod` declarations (which is not super hard, but requires fish to be aware of cargo's directory layout instead of just running `cargo metadata`).
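Concretely, a completion helper would need to do something like the following to recover the test names from `tests/runner.rs` (a sketch; `test_modules` is a hypothetical helper, not existing fish or cargo functionality):

```rust
// Hypothetical helper a completion script could call: list test-module
// names by parsing the `mod xxx;` declarations in tests/runner.rs.
fn test_modules(runner_src: &str) -> Vec<String> {
    runner_src
        .lines()
        .filter_map(|line| {
            let line = line.trim();
            // Keep only plain `mod name;` declarations.
            line.strip_prefix("mod ")?
                .strip_suffix(';')
                .map(|name| name.trim().to_string())
        })
        .collect()
}

fn main() {
    let src = "mod test_abandon;\nmod test_working_copy;\n";
    let mods = test_modules(src);
    assert_eq!(mods, ["test_abandon", "test_working_copy"]);
    println!("{:?}", mods);
}
```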
> Can you compare the resulting runtimes for running the whole test suite
It takes about 13.5 seconds for me to run the tests with `cargo nextest run` after they're compiled, both before and after this PR. So the total time goes from ~45 to ~18 seconds (and it's better than 18 in practice, because it's rare for the very last test run to be the only one that fails).
Thanks for the PR, @jyn514, and thanks for reviewing and testing to everyone else. @arxanas, are you okay with this? It sounds like we're just waiting for your approval, but no rush.
I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?
P.S. The tests are failing because a few new test files were added in the last week.
> I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?
It doesn't seem that this can break the binaries, so what's your concern? If it breaks the CI actions, I'd rather learn that tomorrow than in a month.
OK, this is definitely up to Martin. (I was vaguely worried about the CI.) Please feel free to merge, and thank you again!