rules_java

Improve performance when running a large number of JUnit tests

Open swarren12 opened this issue 4 months ago • 0 comments

Title is a bit vague here, apologies for that.

Out of the box, java_test accepts only a single entry point (one test_class), e.g.

java_test(
    name = "com.example.MyLovelyUnitTest",
    test_class = "com.example.MyLovelyUnitTest",
    srcs = [ ... ],
    # etc etc
)

If one wishes to build a single target from multiple test classes, there are currently two well-publicised workarounds:

  1. Use a macro wrapper;
  2. Use a @Suite or similar.

Unfortunately, each of these comes with a negative side effect with respect to performance:

  1. Using a macro wrapper to turn the glob of source files into distinct java_test targets is documented to have a significant performance impact, because workers have to be created and torn down for each target. To give a rough idea of the impact, here are some numbers taken from a Bazel project with ~5000 unit tests using a @Suite:
$ bazel clean
$ time bazel test //... --build_tests_only --test_lang_filters=java --test_size_filters=small
...
Executed 498 out of 498 tests: 498 tests pass.
...
bazel test //... --build_tests_only --test_lang_filters=java   2.58s user 1.55s system 0% cpu 16:41.63 total

... vs using a macro wrapper + aggregate test_suite:

$ bazel clean
$ time bazel test //... --build_tests_only --test_lang_filters=java --test_size_filters=small
...
Executed 4691 out of 4691 tests: 4691 tests pass.
...
./bb bazel bz test //... --build_tests_only --test_lang_filters=java   2.72s user 1.69s system 0% cpu 26:47.95 total

Both runs cover the same set of tests (498 suite targets vs 4691 per-class targets), but using a separate java_test for each individual class increases the total wall-clock time by ~60% (16:41.63 vs 26:47.95, i.e. about 1002 s vs 1608 s).

  2. Generating a @Suite (either at compile time or dynamically via something like AllTests) clashes with --flaky_test_attempts: if any test case fails, the entire suite is reported as failed, so all of its tests are run again. Sharding mitigates this somewhat, but there's a cap of 50 on the number of shards.

Option 1 is "okay" in cases where there are few, long-running tests; option 2 is "okay" for a lot of fast-running tests. It would be nice to have a "one-size-fits-all" solution.

There has been an open issue in the main Bazel repository for a few years now that has a bit of overlap, but that one seems more focused on convenience than on performance. As the Java rules are being broken out, I thought it might make sense to move it over here for an updated discussion.

When I came across the original issue, I did a very quick-and-hacky PoC of how the built-in Bazel test runner could be updated to support multiple classes. On revisiting it, though, I'm not sure that would actually solve the performance problem on its own: it looks as though flaky test attempts are handled outside the test runner process, so I'm guessing this solution would end up working in the same way as a @Suite.

swarren12 • Feb 13 '24 08:02