feat: Add testQuick for fine-grained selective test execution using codesig
Summary
Implements fine-grained selective test execution for Java modules using Mill's existing codesig bytecode callgraph analysis. This addresses issue #4109.
Key Changes
- New testQuick task in TestModule.scala that only runs tests affected by code changes since the last successful run
- CodeSig worker module (CodeSigWorkerModule.scala) providing isolated classloader-based codesig computation
- Worker implementation (CodeSigWorker.scala) invoking CodeSig.compute() to get method-level bytecode signatures
- Integration test demonstrating selective test execution with a Java module
How testQuick Works
testQuick provides incremental test execution using the codesig callgraph.
First run:
Acts like test. All tests execute, and during this run:
- Method-level bytecode signatures are computed via codesig.
- These are aggregated into class-level hashes.
- For each test class, dependent classes (based on the callgraph) are recorded.
- A snapshot of dependency hashes and test outcomes is written to the module's out directory.
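The aggregation step could look roughly like the sketch below. It is only an illustration: the ClassHashes object, the owningClass helper, and the "Class#method" key format are assumptions made for the example, and the signature strings that codesig actually produces may be shaped differently.

```scala
object ClassHashes {
  // Hypothetical key format: "<fully.qualified.Class>#<method descriptor>".
  // The real codesig signature strings may look different.
  def owningClass(methodSig: String): String =
    methodSig.takeWhile(_ != '#')

  // Fold the per-method hashes of each class into a single class-level hash.
  // Sorting by signature keeps the result independent of map iteration order.
  def aggregate(methodHashes: Map[String, Int]): Map[String, Int] =
    methodHashes
      .groupBy { case (sig, _) => owningClass(sig) }
      .map { case (cls, methods) =>
        val combined = methods.toSeq.sortBy(_._1).foldLeft(17) {
          case (acc, (sig, hash)) => 31 * (31 * acc + sig.hashCode) + hash
        }
        cls -> combined
      }
}
```

With this shape, changing the hash of any one method changes the hash of its owning class only, which is what lets a later run detect the change at class granularity.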
Subsequent runs:
testQuick recomputes class-level hashes and compares them to the snapshot.
A test is re-run only if:
- Its compiled class changed,
- Any dependency class changed,
- It previously failed,
- It is newly added.
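The four conditions above can be sketched as a single filter. The Snapshot and TestRecord case classes and the select function are hypothetical names introduced for this example; the PR's actual snapshot schema and code may differ.

```scala
object DirtyTests {
  // Hypothetical snapshot shape; the PR's actual JSON schema may differ.
  final case class TestRecord(
      dependencyHashes: Map[String, Int], // dependency class -> hash at last run
      passed: Boolean
  )
  final case class Snapshot(
      classHashes: Map[String, Int], // class -> hash at last run
      tests: Map[String, TestRecord] // test class -> recorded state
  )

  // Returns the test classes that need to run again, in their original order.
  def select(
      testClasses: Seq[String],
      current: Map[String, Int],
      previous: Snapshot
  ): Seq[String] =
    testClasses.filter { test =>
      previous.tests.get(test) match {
        case None => true // newly added: not present in the snapshot
        case Some(rec) =>
          // The test's own compiled class changed.
          val ownChanged = current.get(test) != previous.classHashes.get(test)
          // Any recorded dependency changed or disappeared.
          val depChanged = rec.dependencyHashes.exists { case (dep, oldHash) =>
            !current.get(dep).contains(oldHash)
          }
          ownChanged || depChanged || !rec.passed
      }
    }
}
```

A dependency that no longer exists is treated as changed in this sketch, which errs on the side of re-running the test.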
Persistence:
testQuick maintains a per-module JSON snapshot representing the state after the last successful run. This snapshot stores:
- Class-level bytecode hashes for all classes on the run/test classpaths
- For each test class: dependency classes, their hashes, and pass/fail result
The snapshot is written to Task.dest, participating in Mill's standard clean/isolated semantics. If the snapshot is missing or incompatible, testQuick falls back to a full run and writes a fresh snapshot.
Benefits
- Uses existing codesig infrastructure (same as selective execution)
- Works at bytecode level - no need for additional analysis tools
- Persists state between runs for incremental testing
- Falls back to full test run when state is missing
- No new caching layers - all persistence uses Mill's existing out/ structure
Files Changed
- libs/javalib/api/src/mill/javalib/codesig/CodeSigWorkerApi.scala - Worker API trait
- libs/javalib/src/mill/javalib/codesig/CodeSigWorkerModule.scala - External module
- libs/javalib/codesig-worker/src/mill/javalib/codesig/CodeSigWorker.scala - Worker impl
- libs/javalib/src/mill/javalib/TestModule.scala - Added testQuick task
- libs/javalib/src/mill/javalib/JavaModule.scala - Added methodCodeHashSignatures
- libs/javalib/package.mill - Added codesig-worker module
- website/docs/modules/ROOT/pages/javalib/testing.adoc - Documentation
- Integration test files for testQuick functionality
Test Plan
- [ ] Run existing Mill test suite
- [ ] Run new TestQuickJavaModuleTests integration test
- [ ] Manual verification with sample Java project
Closes #4109
@SolariSystems can you explain to me how the persistence of the state between runs works?
Also, it would be great if the PR description explained in general how this works and how it is used.
Here is how persistence works.
testQuick maintains a per-module JSON snapshot that represents the state of the world after the last successful run.
What is stored:
- Class-level bytecode hashes for all classes on the run/test classpaths (derived from codesig's method-level signatures).
- For each test class:
- The set of dependency classes referenced in the codesig callgraph.
- The class-level hashes of those dependencies at the time of the run.
- The pass/fail result of the test.
This snapshot is written into the module's Mill out directory (Task.dest), so it participates in Mill's standard clean/isolated semantics.
How it is used on subsequent runs:
- Current class-level hashes are recomputed.
- The previous snapshot is loaded (if available).
- A test class is marked "dirty" if:
- Its own class hash changed,
- Any stored dependency hash changed,
- It failed in the previous run,
- It is new and did not exist in the snapshot.
- Only dirty tests are executed. Everything else is skipped.
If the snapshot is missing, unreadable, or incompatible, testQuick falls back to a full run and writes a fresh snapshot. This ensures clean recovery without manual intervention.
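The fallback behaviour might be sketched as follows. SnapshotStore, load, testsToRun, and the parse and dirty parameters are all hypothetical and exist only for illustration; parse stands in for whatever JSON decoding the PR actually uses.

```scala
import java.nio.file.{Files, Path}
import scala.util.Try

object SnapshotStore {
  // `parse` is a hypothetical stand-in for the real JSON decoding.
  // It returns None when the stored schema is incompatible.
  def load[S](file: Path, parse: String => Option[S]): Option[S] =
    if (!Files.exists(file)) None // missing snapshot
    else
      Try(new String(Files.readAllBytes(file), "UTF-8")).toOption // unreadable -> None
        .flatMap(text => Try(parse(text)).toOption.flatten) // parse failure -> None

  // No usable snapshot means a full run; otherwise only the dirty tests run.
  def testsToRun[S](
      all: Seq[String],
      snapshot: Option[S],
      dirty: (Seq[String], S) => Seq[String]
  ): Seq[String] =
    snapshot.fold(all)(s => dirty(all, s))
}
```

Collapsing every failure mode into None keeps the caller simple: it never has to distinguish a missing file from a corrupt one, since both lead to the same full run.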
This yields fine-grained selective testing with no new caching layers. All persistence uses Mill's existing out/ structure and invalidates cleanly when the directory is removed.
Did you run the tests? They seem to be failing, along with MIMA binary compatibility checks
Thank you for flagging this. The MiMa check was failing because methodCodeHashSignatures was declared as an abstract method in the public TestModule trait. MiMa correctly flags this as a binary-incompatible change, since it forces all existing subclasses to implement a new method.
Root cause: adding an abstract method to a public trait is a binary-breaking change.
Fix (commit e2da3ca8363): Provided a concrete default implementation:
def methodCodeHashSignatures: T[Map[String, Int]] = Task { Map.empty[String, Int] }
This preserves backward compatibility: existing TestModule implementations continue to work unchanged, while modules opting into testQuick can override this method to enable fine-grained selective testing.
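As a rough illustration of the pattern, the sketch below uses plain Scala traits rather than Mill's task types. TestModuleLike, LegacyModule, and OptInModule are made-up names, and the example only shows the source-level effect; MiMa itself checks compiled class files.

```scala
object DefaultMethodSketch {
  // Stand-in for the public trait; this is not Mill's real TestModule.
  trait TestModuleLike {
    def testClasses: Seq[String]
    // Concrete default: existing implementors need no changes.
    def methodCodeHashSignatures: Map[String, Int] = Map.empty
  }

  // A pre-existing implementor that knows nothing about the new method.
  object LegacyModule extends TestModuleLike {
    def testClasses: Seq[String] = Seq("FooTest")
  }

  // A module that opts in by overriding the default.
  object OptInModule extends TestModuleLike {
    def testClasses: Seq[String] = Seq("FooTest")
    override def methodCodeHashSignatures: Map[String, Int] =
      Map("a.Foo#x()V" -> 1) // hypothetical signature key
  }
}
```

Had methodCodeHashSignatures been left abstract in this sketch, LegacyModule would no longer compile.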
I should be upfront: I don't have a local Mill development environment set up to run the full test suite myself. The fix is based on understanding MiMa's binary compatibility rules and reviewing similar patterns in the codebase. CI will verify whether this resolves the issue.
Let me know if you'd like any changes to the approach.
Turning this into a Draft since it's not quite ready yet. As mentioned in developer.adoc (https://github.com/com-lihaoyi/mill/blob/main/developer.adoc#continuous-integration--testing), please make sure CI is green on your fork before marking it as ready for review.