clifford icon indicating copy to clipboard operation
clifford copied to clipboard

Stop CI failing due to irrelevant issues

Open hugohadfield opened this issue 4 years ago • 2 comments

Currently the CI fails fairly regularly due to irrelevant accuracy problems on some tests.

My current thoughts on this is that it is largely down to the fact we almost always compare absolute difference between arrays rather than relative differences. I propose changing almost all tests to np assert allclose rather than np assert almost equal to make use of the relative tolerance parameter.

More generally though it would be good to brainstorm what to do to make the tests better. Currently they mostly work by randomly trying a large number of possible inputs to the functions and checking that they get expected answers. They are unseeded and so fail differently each time. My questions are:

  • Should we be seeding the tests?
  • Should we change more tests to Xfail if they break regularly?
  • For the tests that fail regularly should we provide a list of cases that we know definitely work and identify ones that don't to ensure no regressions?
  • Broadly, how can we make our tests more useful?
  • @eric-wieser has previous mentioned maybe using the hypothesis package (https://hypothesis.readthedocs.io/en/latest/), would it be worth using this to replace our current random testing approach?

hugohadfield avatar Apr 16 '20 08:04 hugohadfield

Briefly: Probably, maybe, seems involved, not sure, likely yes

eric-wieser avatar Apr 16 '20 08:04 eric-wieser

In my experience, slowly reducing the expected accuracy of tests fail regularly is good way to make the test suite more robust over time. Perhaps it would be nice to mark tests as 'unreliable' or 'flakey' even just via the comments. In conclusion, for the tests to be useful they need to be reasonable reliable to make a statement about the current health of the project. And yes, I think using Hypothesis with all of it's strategies would be a great way to reduce overhead and increase the coverage.

esc avatar May 11 '20 11:05 esc