Create separate GitHub Actions jobs for each test suite
Currently, CI runs are split on a per-GHC basis. However, all test suites are executed in parallel within the same run, which has some serious drawbacks:
- Output of test suites is interleaved, which can be very confusing
- There are thousands of lines of output, which makes it hard to find a failing test
- It is hard to infer which test suite the test failure is coming from
The ideal solution to this would be to execute each test suite as a parallel job that would be clearly visible in the UI. The acceptance requirements for this task are:
- All test suites should run in parallel
- All test suites should be executed, and only after all of them have finished should the decision be made whether to pass or fail CI. Naturally, a failure of any test suite would fail the whole build
- Addition of a test suite should not go unnoticed. In other words, CI should not be allowed to pass if there is a test suite in the repo that has not been executed. I believe `hie.yaml` could be leveraged somehow to enforce this, but there might be other ways, by using `cabal` or something.
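A rough sketch of how the three requirements above could map onto a workflow, offered as one possible shape rather than the final setup. The workflow path `.github/workflows/tests.yml`, the GHC version, and the `example-package-*` suite names are placeholders invented for illustration. Installation of the C dependencies (libsodium, blst, secp) is omitted. The `suites_listed` job uses a plain `grep` over `.cabal` files instead of `hie.yaml`, which is an assumption about what might be simplest, not a settled choice.

```yaml
# Hypothetical file: .github/workflows/tests.yml
name: tests
on: [push, pull_request]

jobs:
  test:
    # One job per (GHC, test suite) pair, each with its own log in the UI.
    name: ${{ matrix.suite }} (GHC ${{ matrix.ghc }})
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false          # let every suite finish even if one fails
      matrix:
        ghc: ["9.2.8"]          # placeholder version
        suite:                  # placeholder cabal targets
          - "example-package-a:test:example-a-tests"
          - "example-package-b:test:example-b-tests"
    steps:
      - uses: actions/checkout@v4
      - uses: haskell-actions/setup@v2
        with:
          ghc-version: ${{ matrix.ghc }}
      - run: cabal build ${{ matrix.suite }}
      - run: cabal test ${{ matrix.suite }}

  suites_listed:
    # Fails if a `test-suite NAME` stanza exists in the repo but NAME
    # does not appear as a `:test:NAME` target in this workflow file.
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: |
          for name in $(grep -rhiE --include='*.cabal' '^test-suite[[:space:]]+' . | awk '{print $2}'); do
            grep -qF ":test:${name}\"" .github/workflows/tests.yml \
              || { echo "test suite not run in CI: ${name}"; exit 1; }
          done

  complete:
    # Single pass/fail decision, made only after all other jobs have ended.
    needs: [test, suites_listed]
    if: always()
    runs-on: ubuntu-latest
    steps:
      - run: |
          test "${{ needs.test.result }}" = "success"
          test "${{ needs.suites_listed.result }}" = "success"
```

With something like this, `complete` could be the only required status check, so adding a suite to the matrix would not require touching branch protection settings.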
FYI, nix provides separate logs for each test suite, so now that we have hydra builds working again this could be considered a solution.
> so now that we have hydra builds working again this could be considered a solution.
Nope, I do not want to rely on Hydra. The fact that it is working now does not give me any confidence that it will continue working into the future.
Moreover, switching completely to Hydra is not a good idea at all, since we would then be stuck with a nix-only build setup and no cabal-only setup. The cabal-only setup is in fact important to test because:
- We need to make sure people are not required to use nix in order to use ledger. Especially considering that we have special C dependencies in the form of libsodium, blst and secp.
- nix caches test runs, which is not good for property tests. We want to run all test suites on every run, even though it takes a bit longer.
I wasn't meaning to suggest that we switch over to using nix exclusively. I was just thinking that if we have a test failure, and the output is garbled in the GH build, we could always look at the nix logs as a backup. I think it would be rare that the tests would fail in GH and not on Hydra.
I'm happy to do the improvement to the GH builds, though, since you feel it's worth it.