jj icon indicating copy to clipboard operation
jj copied to clipboard

compile integration tests as a single binary

Open jyn514 opened this issue 1 year ago • 8 comments

closes https://github.com/martinvonz/jj/issues/2912

Checklist

If applicable:

  • [x] I have updated CHANGELOG.md
  • [x] I have updated the documentation (README.md, docs/, demos/)
  • [x] I have updated the config schema (cli/src/config-schema.json)
  • [x] I have added tests to cover my changes

jyn514 avatar Feb 02 '24 08:02 jyn514

Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.

arxanas avatar Feb 03 '24 02:02 arxanas

The benchmark in the commit message is touch lib/src/lib.rs ; cargo t --test runner --no-run, and I'm assuming hyperfine throws out the first run like a good benchmarking tool should. So, it's all about incremental compilation, I think.

ilyagr avatar Feb 03 '24 02:02 ilyagr

I also unscientifically tried touch lib/src/lib.rs ; cargo insta test --test-runner nextest -- test_abandon. It feels fast with this PR, though which linker you use might again make a difference.

I found another minor downside: the fish shell has completions for cargo test --test <TAB>, but not for cargo test -- <TAB> (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).

ilyagr avatar Feb 03 '24 03:02 ilyagr

Regarding touch, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that cargo might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.

arxanas avatar Feb 03 '24 03:02 arxanas

Regarding touch, @jyn514 would probably know better than me, but in my blog post I actually modified the file in the worry that cargo might do something content-addressed as part of compilation and it wouldn't represent a "real" incremental change.

cargo does not do content-addressed hashing, no. you can see this if you set CARGO_LOG=info:

; touch lib/src/lib.rs ; CARGO_LOG=info cargo t --no-run
   0.089419238s  INFO cargo::core::compiler::fingerprint: stale: changed "/home/jyn/src/jj/lib/src/lib.rs"
   0.089482460s  INFO cargo::core::compiler::fingerprint:           (vs) "/home/jyn/.local/lib/cargo/target/debug/.fingerprint/jj-lib-49a46e34c360fc2e/dep-lib-jj-lib"
   0.089485415s  INFO cargo::core::compiler::fingerprint:                FileTime { seconds: 1706861032, nanos: 988871571 } < FileTime { seconds: 1706931264, nanos: 137617386 }

Can you compare the resulting runtimes for running the whole test suite and running a single test? The improved compilation times are admirable, but if it turns out that the full test suite :: 5m -> 4m but incremental tests :: 2s -> 4s then it might not be worth it. Ideally I would have incremental rebuild+test times be <1s. When developing jj, I extensively relied on re-running individual tests repeatedly, so that feedback loop is very important to me. Others might feel differently about those trade-offs.

sure; but the times are small enough i don't think times on my computer will be comparable to yours, i'd recommend running this on your own machine to see what the times are.

before this change:

; hyperfine 'touch lib/src/lib.rs && cargo t --test test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test test_working_copy
  Time (mean ± σ):      3.543 s ±  0.168 s    [User: 2.597 s, System: 1.262 s]
  Range (min … max):    3.400 s …  3.847 s    10 runs

after this change:

; hyperfine 'touch lib/src/lib.rs && cargo t --test runner -- test_working_copy'
Benchmark 1: touch lib/src/lib.rs && cargo t --test runner -- test_working_copy
  Time (mean ± σ):      4.129 s ±  0.120 s    [User: 3.636 s, System: 1.593 s]
  Range (min … max):    3.933 s …  4.346 s    10 runs

so only about half a second of regression, from 3.5 to 4 seconds.

(using cargo test is intentional, nextest is actually a bit slower than the built-in framework for single tests)

jyn514 avatar Feb 03 '24 03:02 jyn514

I found another minor downside: the fish shell has completions for cargo test --test <TAB>, but not for cargo test -- <TAB> (or they don't work with this PR). In this way, doing a subset of tests is less convenient for me with this PR (though I think the speedup is worth it for me).

yeah, this is unfortunate. i don't know a way to fix this upstream in fish; it would have to parse the mod declarations (which is not super hard but requires it to be aware of cargo's directory structure instead of just running cargo metadata).

jyn514 avatar Feb 03 '24 03:02 jyn514

Can you compare the resulting runtimes for running the whole test suite

it takes about 13.5 seconds for me to run the tests with cargo nextest run after they're compiled, both before and after this PR. so the total time is going from ~45 to ~18 seconds (and it's better than 18 in practice because it's rare for the very last test run to be the only one that fails).

jyn514 avatar Feb 03 '24 03:02 jyn514

Thanks for the PR, @jyn514, and thanks for reviewing and testing to everyone else. @arxanas, are you okay with this? It sounds like we're just waiting for your approval, but no rush.

martinvonz avatar Feb 03 '24 06:02 martinvonz

I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?

P.S. The tests are failing because a few new test files were added in the last week.

ilyagr avatar Feb 07 '24 01:02 ilyagr

I think Martin will cut a release in the next couple of days. I'm likely being overly cautious, but perhaps we can delay merging this until the release, just in case?

It doesn't seem that this can break the binaries, so what's your concern? If it breaks the CI actions, I'd rather learn that tomorrow than in a month.

martinvonz avatar Feb 07 '24 01:02 martinvonz

OK, this is definitely up to Martin. (I was vaguely worried about the CI) Please feel free to merge, and thank you again!

ilyagr avatar Feb 07 '24 02:02 ilyagr