0.5: random test failures due to unreliable test timings

Open olebole opened this issue 6 years ago • 1 comment

I am trying to update astroplan to the latest version 0.5 on Debian. When running the tests on the different platforms, I get random failures like

>           raise Flaky(message)
E           hypothesis.errors.Flaky: Hypothesis test_boundaries(nside_pow=0, frac=0.4911781148020645, step=1, nest=True) produces unreliable results: Falsified on the first call but did not on a subsequent one

/usr/lib/python3/dist-packages/hypothesis/core.py:751: Flaky
---------------------------------- Hypothesis ----------------------------------
Falsifying example: test_boundaries(nside_pow=0, frac=0.4911781148020645, step=1, nest=True)
Unreliable test timings! On an initial run, this test took 376.64ms, which exceeded the deadline of 200.00ms, but on a subsequent run it took 5.95 ms, which did not. If you expect this sort of variability in your test timings, consider turning deadlines off for this test by setting deadline=None.

in one test or another. So far this has happened on MIPS (32- and 64-bit) and on 64-bit ARM.

I would guess that the unreliable timing comes from the hardware and the load on our test machines; the MIPS machines in particular are rather slow. I don't really see why this should cause a test failure.

As suggested by the error message above, I could disable the check by setting deadline=None. However, as far as I understand the "hypothesis" package, this has to be done individually for each test, which would make a Debian-specific patch rather unmaintainable. Is there a way to switch this off globally, and would you consider doing so in the upstream package? Or am I misunderstanding something here?

Cc: @lpsinger as the Debian package maintainer
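
On the question of a global switch: Hypothesis has settings profiles, which can be registered once and loaded for the whole test session, so the deadline does not have to be set on each test. The following is a minimal sketch, assuming the test suite has a top-level conftest.py that pytest imports before collecting tests. The profile name "no_deadline" is made up for illustration, and nothing here reflects what the upstream package actually does.

```python
# conftest.py (sketch; assumes pytest imports this file before running tests)
from hypothesis import settings

# "no_deadline" is a hypothetical profile name chosen for this example.
# deadline=None turns off the per-example time limit that raised the
# "Unreliable test timings" Flaky error on slow or heavily loaded machines.
settings.register_profile("no_deadline", deadline=None)

# Make the profile the default for every Hypothesis test in this session.
settings.load_profile("no_deadline")
```

Tests that set their own deadline through an explicit @settings(deadline=...) decorator would still use that value. The load_profile call could also be made conditional, for example on an environment variable, if deadlines should stay enabled on faster hardware.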

Dec 07 '19 13:12 olebole

@olebole Does this issue persist with the latest version of astroplan?

Feb 01 '21 06:02 bmorris3