cython-blis

Baffling random failures in core Python behaviour in Windows CI

Open crusaderky opened this issue 2 weeks ago • 2 comments

I'm seeing very frequent (once every 2-3 runs) random failures in CI, which happen

  • exclusively on windows-latest runners, and
  • exclusively on Python 3.13 and 3.14 (not 3.14t).

I cannot yet say conclusively whether they happen only with actions/setup-python or also inside actions/cibuildwheel.

The failures are as follows:

C:\hostedtoolcache\windows\Python\3.13.9\x64\Lib\site-packages\hypothesis\statistics.py:90: in describe_statistics
    runtime_ms = format_ms(t["runtime"] for t in cases)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

times = <generator object describe_statistics.<locals>.<genexpr> at 0x00000200EA90CAD0>

    def format_ms(times: Iterable[float]) -> str:
        """Format `times` into a string representing approximate milliseconds.
    
        `times` is a collection of durations in seconds.
        """
        ordered = sorted(times)
        n = len(ordered) - 1
        if n < 0 or any(math.isnan(t) for t in ordered):  # pragma: no cover
            return "NaN ms"
>       lower = int(ordered[math.floor(n * 0.05)] * 1000)
                            ^^^^^^^^^^^^^^^^^^^^
E       OverflowError: cannot convert float infinity to integer

C:\hostedtoolcache\windows\Python\3.13.9\x64\Lib\site-packages\hypothesis\statistics.py:58: OverflowError
============================ slowest 10 durations =============================
1.06s call     tests/test_gemm.py::test_threads_share_input
0.59s call     tests/test_dotv.py::test_threads_share_input
0.42s call     tests/test_dotv.py::test_memoryview_noconj
0.25s call     tests/test_gemm.py::test_memoryview_notrans

times is a collection of runtimes for a test, and I expect it to be fairly short: anywhere between 0 and a few thousand elements. Note the underline beneath math.floor(n * 0.05): math.floor raises this OverflowError only when its argument is infinite, which means n * 0.05 evaluated to inf.

Either:

  • list.__len__ returned float inf instead of the expected (fairly small) integer, or
  • a small integer * 0.05 returned inf.

Both of the above are absurd and I have no explanation.
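
To make the reasoning above concrete, here is a minimal sketch of what the failing expression normally does. The helper name floor_of_fraction is hypothetical and only mirrors the index computation from the traceback; it is not part of hypothesis. The sketch checks that a small integer times 0.05 stays finite, and that the error message in the traceback is the one math.floor produces for an infinite argument.

```python
import math


def floor_of_fraction(n: int, fraction: float = 0.05) -> int:
    """Hypothetical helper mirroring `math.floor(n * 0.05)` from the traceback."""
    return math.floor(n * fraction)


# For any plausible list length, the product is a small finite float.
plausible_lengths = (0, 1, 100, 5000)
all_finite = all(math.isfinite(n * 0.05) for n in plausible_lengths)

# math.floor raises OverflowError when handed an infinite float,
# which is the error reported in the CI log.
try:
    math.floor(float("inf"))
except OverflowError as exc:
    message = str(exc)
else:
    message = None
```

On a healthy interpreter all_finite is True, so reaching the OverflowError requires either n or the product to be wrong before math.floor is called, which matches the two possibilities listed above.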

crusaderky Dec 08 '25 23:12

@liam-devoe have you ever seen anything like this? Somehow in hypothesis internals, pure-python expressions aren't evaluating to the correct result.

ngoldbaum Dec 09 '25 17:12

Example: https://github.com/explosion/cython-blis/actions/runs/20043099915

crusaderky Dec 09 '25 17:12

This is completely cursed and I have no explanation beyond the two hypotheses you've posed 😄. I've never seen anything like this before.

Liam-DeVoe Dec 14 '25 03:12