kaocha --watch that only reruns tests affected by loaded changes

This is a code sketch of a feature I've been desiring in a test runner: when in watch mode, only rerun tests that have been affected by loaded changes

background

I wanted to do a proof-of-concept to prove it was generally possible (hence no tests yet :) ), and now I'm interested in discussing whether this is something kaocha folks would be interested in including. If so, does my general approach look okay (modulo the TODOs) or do I need to hook into different parts of the system?

I imagine that it could be a different CLI flag, aside from --watch given that it might have (incompleteness) bugs; something like --watch-with-deps-analysis (or something shorter if a good name arises)

behavior

current

Currently in kaocha, with --watch, when a file changes, it is reloaded, along with its corresponding *_test.clj friend. Then the entire test suite is re-run. Back when I worked at Nubank this could mean ~1-2 minutes of tests, and hence I never really used --watch. There is the nice case where if tests fail, upon next code reload, those test failures are run first, and only if they pass are the rest of the tests run, which helps with the re-run times of the entire test suite.

proposed

With the changes in this PR, with --watch (or whatever new flag we give the behavior), when the suite runs for the first time it analyzes all loaded namespaces using Clerk's form hash/dependency analysis and stores the results in an atom. Then when a file changes, it is reloaded, along with its corresponding *_test.clj friend. Right after reload, and before the tests are run, we re-analyze the changed namespaces using Clerk. If any forms change (via the calculated hash), we find all test forms that are dependents of that changed form and pass them back to kaocha's filtering logic to ensure only they are run.

For example, if I'm running kaocha in some random project like ordnungsamt

$ ./bin/kaocha unit --watch

[(........)]
2 tests, 8 assertions, 0 failures.

and I change only the compose-filters def, then only the test that uses compose-filters, either directly or indirectly, is re-run:

[watch] Reloading #{ordnungsamt.core ordnungsamt.core-test}
[watch] Running tests affected by changes #{ordnungsamt.core-test/test-compose-filters}
[(.....)]
1 tests, 5 assertions, 0 failures.

midje test runner

I used to use Midje a lot and got used to the way it did watching and test auto rerunning. It wouldn't re-run the entire test suite on reloads, but rather only tests for namespaces that had to be reloaded. Note that namespace reloading for midje differs from kaocha in that changing a namespace would trigger reloading all depending namespaces. In effect, the midje test runner was sort of a middle ground between kaocha's current watch that re-runs everything and the behavior proposed in this PR that has a form level, instead of ns level, deps analysis that allows one to only rerun individual tests that were affected by reload changes.

for example, with lein midje :autotest on a random repository, changing a central/widely used namespace triggers a lot of other ns reloads and tests to run:

======================================================================
Loading (matcher-combinators.core matcher-combinators.test-helpers matcher-combinators.matchers matcher-combinators.parser matcher-combinators.clj-test matcher-combinators.test matcher-combinators.matchers-test matcher-combinators.test-test matcher-combinators.parser-test matcher-combinators.midje matcher-combinators.midje-test matcher-combinators.standalone matcher-combinators.standalone-test matcher-combinators.core-test matcher-combinators.cljs-test)

>>> Midje summary:
All checks (184) succeeded.

>>> Output from clojure.test tests:

Ran 49 tests containing 136 assertions.
0 failures, 0 errors.
[Completed at 10:12:19]

but if you change a ns that doesn't isn't used by any other namespaces, it reloads and reruns fewer test

======================================================================
Loading (matcher-combinators.midje matcher-combinators.midje-test)
nil
All checks (117) succeeded.

side note

this PR depends on this unmerged commit to Clerk: https://github.com/nextjournal/clerk/commit/32ef18e12df6cfd2a38c13ccaa9d6fa932fa0fdb

Apr 06 '22 08:04 philomates

Thanks Phillip, this is great. Normally my first question is "can it be a plugin"? In this case it could, bit it probably does make sense to have it in core, maybe even on by default. We have this thing where pressing enter in watch mode runs the tests again, so if that could trigger a full run then that would be a really cool combo. Get fast iteration, then when you want a more thorough check you can ask for it.

I'd want to look at how much clerk as a dependency pulls in. Maybe the hashing could be released as its own lib, or we simply inline it.

Apr 06 '22 08:04 plexus

@philomates No rush; just wondering if you are planning to pick this up again. Thanks!

Nov 03 '22 04:11 alysbrooks

hmm, I probably won't dive into this again myself; I was curious about the possibilities but don't use the setup myself

Jan 11 '23 16:01 philomates

kaocha kaocha copied to clipboard

--watch that only reruns tests affected by loaded changes

background

behavior

current

proposed

midje test runner

side note

kaocha
kaocha copied to clipboard