Feature: Test suite partitioning

Open frenchy64 opened this issue 1 year ago • 8 comments

CI platforms like Actions and CircleCI support matrix builds, which can be used to fan out a number of parallel jobs that each execute part of a test suite.

For this to result in faster builds, the test runner must be able to partition a test suite.

An Actions build might look like this:

jobs:
  test:
    runs-on: ubuntu-22.04
    strategy:
      matrix:
        id: [0,1,2,3,4]
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}

This would cover the entire test suite by running:

./bin/kaocha --partition-index 0 --partitions 5
./bin/kaocha --partition-index 1 --partitions 5
./bin/kaocha --partition-index 2 --partitions 5
./bin/kaocha --partition-index 3 --partitions 5
./bin/kaocha --partition-index 4 --partitions 5

You could imagine different strategies for partitioning:

  • split by test namespace
    • don't need to load tests you don't need
  • load all namespaces, split by deftest
    • share fixtures?
  • use timing results from prior runs to load-balance tests
  • have Kaocha inform CI how many partitions are needed in order to build in a certain timeframe, e.g.,
jobs:
  setup:
    runs-on: ubuntu-22.04
    outputs:
      partitions: ${{steps.partitions.outputs.partitions}}
    steps:
      - uses: actions/cache/restore@v4
        with:
          path: timings.edn
      - id: partitions
        run: echo "partitions=$(./bin/kaocha --print-partitions --target-time 5m --prior-timings timings.edn | bb -e '(-> *input* range json/encode println)')" >> "$GITHUB_OUTPUT"

  test:
    runs-on: ubuntu-22.04
    needs: setup
    strategy:
      matrix:
        id: ${{ fromJSON(needs.setup.outputs.partitions)}}
    steps:
      - run: ./bin/kaocha --partition-index ${{ strategy.job-index }} --partitions ${{ strategy.job-total }}
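
The last two strategies in the list (balancing on prior timings, and deriving the partition count from a target time) could be sketched roughly as follows. This is Python for illustration, not Kaocha code: `suggest_partition_count`, `balance`, and the shape of the timings map are hypothetical, and the greedy longest-first heuristic is just one common choice, not something this proposal prescribes.

```python
import math


def suggest_partition_count(timings, target_seconds, max_partitions=10):
    """Estimate how many partitions are needed to finish near target_seconds.

    `timings` maps a test id to its duration in seconds from a prior run
    (hypothetical shape, chosen for this sketch).
    """
    total = sum(timings.values())
    needed = max(1, math.ceil(total / target_seconds))
    return min(needed, max_partitions)


def balance(timings, partitions):
    """Greedy longest-first assignment of tests to the least-loaded bucket."""
    buckets = [[] for _ in range(partitions)]
    loads = [0.0] * partitions
    # Longest tests first; ties broken by name so the result is deterministic.
    ordered = sorted(timings.items(), key=lambda kv: (-kv[1], kv[0]))
    for test_id, seconds in ordered:
        i = loads.index(min(loads))  # first least-loaded bucket
        buckets[i].append(test_id)
        loads[i] += seconds
    return buckets
```

Note that the count is only a lower bound based on total time: a single test that runs longer than the target still exceeds it no matter how many partitions are used.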

The partitioning algorithm must be deterministic and reproducible, with every test being run. It should assume that each --partition-index is covered, which is the user's responsibility (or could be packaged in a reusable Action). The simplest algorithm might be to sort tests by name before partitioning. Test runs could be randomized by using the current git sha as a seed.
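
As a minimal sketch of that "sort, then partition" idea (in Python for illustration; the `partition` function and its parameters are hypothetical, not part of Kaocha): every job sorts the same list of test ids, optionally shuffles it with a seed derived from a shared string such as the git sha, and takes every `total`-th test starting at its own `index`.

```python
import hashlib
import random


def partition(test_ids, index, total, seed=None):
    """Return the tests that partition `index` of `total` should run."""
    if not 0 <= index < total:
        raise ValueError("index must be in [0, total)")
    ordered = sorted(test_ids)  # same starting order on every machine
    if seed is not None:
        # Derive an integer seed from a string such as a git sha, so every
        # job in the same build shuffles identically.
        digest = hashlib.sha256(seed.encode("utf-8")).digest()
        random.Random(int.from_bytes(digest[:8], "big")).shuffle(ordered)
    return ordered[index::total]  # round-robin slice
```

Because each job works from the same ordered list, the slices are disjoint and together cover every test, provided all indexes from 0 to `total - 1` are actually run.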

frenchy64 • Nov 17 '24 18:11

I started looking into this and maybe this is just a plugin or a hook? Guidance welcome.

frenchy64 • Nov 18 '24 19:11

Yeah sure, this would be fairly easy to do either as a plugin or as a hook. Seems general purpose enough that it would be nice to make a plugin for it.

plexus • Nov 18 '24 19:11

Perhaps it could be an extra feature on kaocha.plugin/filter? I think I need to move kaocha.plugin/randomize to pre-run for determinism. I'll post a proof-of-concept if I figure it out.

frenchy64 • Nov 18 '24 19:11

PoC implementing partitioning by deftest and also partitioning by suite (less useful): https://github.com/frenchy64/kaocha/pull/5

The randomize plugin cannot go before the filter plugin unless a fixed --seed is provided. In the PoC I swapped the default order of the plugins, which would break reproducible builds for specific seeds. A possible workaround is to swap the order of the plugins only if any of the new partitioning args are provided.

frenchy64 • Nov 18 '24 22:11

Changed to a less disruptive approach of re-sorting the test plan if it's already been randomized and printing a warning.

frenchy64 • Nov 19 '24 00:11

Added an example Action workflow that automatically scales based on a target build time. I eventually want to provide a reusable workflow that makes this a one-liner, but it needs a couple of features from kaocha:

  • ability to save profiling results to a file
  • ability to process previous timing results and suggest how many partitions to use

And a bunch of other details. The Actions-specific stuff is in the workflow.

I sort of crammed everything into the first place it seemed appropriate. This really blurs the lines between some plugins. I was hesitant to create a new plugin because of how critical it is for the user to get the plugin ordering correct.

Here's an example build that automatically partitions kaocha's own unit tests to achieve a target running time of 10 minutes using up to 10 partitions: https://github.com/frenchy64/kaocha/actions/runs/11924460226

If anyone can make sense of the sketch I'd love some feedback.

frenchy64 • Nov 20 '24 00:11

Hey, finally had a chance to look at this... just skimmed it to get a first impression. I really appreciate the added tests, including the gherkin tests.

I would much prefer for this to be an isolated plugin. Kaocha's philosophy is to have a relatively lean core but provide affordances to do more advanced stuff. In particular seeing so much of this go into the filter plugin seems wrong. Filter has a narrow job of dealing with focus/skip flags and metadata.

When a plugin gets added we call the register multimethod, and it's the plugin's own responsibility to add itself to the stack. If there are ordering constraints they can be checked there. Or if we want to get fancy we can let plugins declare dependencies.
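
To illustrate the ordering check described here, a rough model in Python (Kaocha's actual register hook is a Clojure multimethod; this `register` function, its `must_follow` parameter, and the `partition` plugin name are hypothetical): a plugin adds itself to the stack and refuses to register unless the plugins it depends on are already present.

```python
def register(stack, name, must_follow=()):
    """Append plugin `name` to `stack`, checking its ordering constraints.

    `stack` is the list of plugin names registered so far, in order.
    """
    missing = [dep for dep in must_follow if dep not in stack]
    if missing:
        raise ValueError(f"{name} must be registered after: {', '.join(missing)}")
    return stack + [name]


# Usage: a partitioning plugin that insists on running after the filter plugin.
stack = register([], "kaocha.plugin/filter")
stack = register(stack, "partition", must_follow=["kaocha.plugin/filter"])
```

Declaring the dependencies as data, as in `must_follow`, is roughly what "let plugins declare dependencies" would look like; the check itself stays in one place.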

All that said, if it's really too hard to do as an isolated plugin, or if you don't have the heart to rework the whole thing, then I'm also OK with merging it as is. It's a very useful feature, and I'd rather have it go in as it is than not have it go in at all.

plexus • Dec 11 '24 09:12

@plexus thanks for the feedback. If we can constrain a partitioning plugin to always be after filtering, I think it should work. I'll give it a shot.

frenchy64 • Dec 13 '24 05:12