
how to express that a large number of tests are expected to fail?

Open devoncarew opened this issue 1 year ago • 8 comments

I'm currently working on a project where most of the tests are code-gen'd from a specification. Many of them currently fail - of ~7,590 tests, 545 are failing - and as I fix things the pass percentage increases.

Because some tests currently fail, I'm not running tests on my CI (otherwise it's just a continuous red signal). This means that, unfortunately, I effectively have no regression coverage - I don't know when I break tests that were previously passing.

One solution other systems with lots of tests have landed on is the notion of status files - encoding expected failures in a data file. The CI would then pass when the test results match what's described in the status file.

Have you thought about how to support this use case - having a setup with lots of tests, where you can't reasonably expect them all to pass (but you still want CI + test coverage)? There are a few possible solutions I've thought of:

encode 'failure expected' in the test definition

This would mean adding a new 'bool fails' flag to the test() method, which would invert the normal test expectation. This technique avoids the complexity of a new file format + aux. file. It has the drawback that people would need to modify the test definitions (for me, in ~500 different source locations). It also couldn't express things like 'only expected to fail on Windows'.
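For illustration, a sketch of what this might look like (note: `fails` is the proposed flag, not an existing `test()` parameter, and `runSpecCase` is a placeholder for the generated test body):

```dart
import 'package:test/test.dart';

void main() {
  // Hypothetical `fails: true`: inverts the expectation, so in CI this
  // test counts as passing only while its body still throws.
  test('codegen/spec_0042', () {
    runSpecCase(42); // placeholder for the code-gen'd assertions
  }, fails: true);
}
```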

parameterize test()

  • have a new expectedStatus(String testId) closure param for test()
  • each package could decide how to figure out the test status (backed by a status file? a database?)
  • package:test doesn't need to add much new mechanics in terms of file formats and such
  • it pushes some of the complexity to the users of the feature (which is probably ok here)
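A sketch of how that closure parameter could be wired up (all names here are hypothetical - `expectedStatus` is not an existing `test()` parameter, and the `TestStatus` enum and status map stand in for whatever store a package chooses):

```dart
import 'package:test/test.dart';

enum TestStatus { pass, fail }

// Assumed: each package supplies its own lookup, e.g. loaded from a
// status file or a database at startup.
final _statuses = <String, TestStatus>{
  'codegen/spec_0042': TestStatus.fail,
};

TestStatus expectedStatus(String testId) =>
    _statuses[testId] ?? TestStatus.pass;

void main() {
  // Hypothetical `expectedStatus` parameter: the runner would invert the
  // pass/fail signal for any test whose expected status is `fail`.
  test('codegen/spec_0042', () {
    // ... generated assertions ...
  }, expectedStatus: expectedStatus);
}
```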

support a status file

This would mean defining a status file format for package:test (or re-using dart_test.yaml?). Minimally, you'd want to be able to express the n tests that are expected to fail; you might push it as far as allowing different sections in the file (using package:boolean_selector?), each with separate test expectations for different platforms.
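For concreteness, one possible shape for such a file (this format is purely hypothetical - package:test defines no such file today; only the section keys borrow package:boolean_selector syntax):

```yaml
# Hypothetical status file; test IDs listed here would be expected to fail.
default:
  expected_to_fail:
    - codegen/spec_0042
    - codegen/spec_0107

# Section keys could be package:boolean_selector platform expressions.
"windows || chrome":
  expected_to_fail:
    - codegen/spec_0203
```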

devoncarew · Dec 29 '22