MemoryError since 6.128.3 on multiple platforms and Python versions
Hi, I have a test suite using hypothesis which has been stable for a couple of years, but since 6.128.3 hypothesis seems to be causing a MemoryError. This GitHub Actions report shows the suite failing across platforms and Python versions (3.10 and 3.12); the Windows runs show the issue explicitly as a MemoryError. I've since pinned the dependency to 6.128.2 and the suite passes fine (here).
Inspecting the logs of the failing run shows that it's not failing on a specific test; rather, different platforms are failing on different tests, although they all fail 'about the same time' into the test run.
Running the test suite locally results in either VSCode or the OS crashing unceremoniously. However, all tests can be run individually without any issue (all passing), suggesting that the issue stems from something that's going on at a session level in the background(?).
Any ideas? I see 6.128.3 introduced changes concerning recursive draws - possibly not a coincidence?
Thanks for this tremendous package! I'm hoping that whatever's at the root of this issue will quickly become apparent to those involved with 6.128.3... 🤞
My best guess is that we did something (in https://github.com/HypothesisWorks/hypothesis/pull/4295) which happened to take you from "fits in memory on that worker" to "doesn't". And that leads me to suspect that you're hitting the performance issue that we just fixed in https://github.com/HypothesisWorks/hypothesis/pull/4349. (problem, solution - if I'm right)
So... try the latest version of Hypothesis again?
Seems like it's still failing with that fix in place: https://github.com/maread99/market_prices/actions/runs/14345199458/job/40213489887?pr=427
It's possible this is just larger recursive trees being generated after the subtree mutation, but we should also investigate this to make sure the mutation is working as intended, especially since I don't see any explicit st.recursive or st.deferred strategies (though there are other ways to construct strategies with multiple same-label substrategies).
Thanks for looking at this, I see you've picked up on dependabot triggering the failed run with the latest version.
Please let me know if there's anything I can do to help (although I have no appreciation of what goes on under the hood 😳).
Unless I'm misreading the stack traces, the MemoryError is not the root failure here — it happens when trying to repr the failing test's traceback or nodeid.
So the failure may be unrelated to memory consumption; it's just that the test didn't fail before and hence wasn't repr'd.
I note, related or not, that we have had issues regarding the size of the repr of certain strategies. Perhaps recursive ones, I don't recall off-hand.
I've managed to work around this issue. I hope the following explanation might help identify the issue within hypothesis and/or help anyone who runs into a similar one...
I narrowed the issue down to tests using the end_minutes strategy. Although every test would pass if executed in isolation, when run as a suite the test run would crash. This could be avoided if two or more of the following tests that use end_minutes were skipped:
- test_daterange_start_end
- test_daterange_end_only_start_None
- test_daterange_duration_cal_end_minute
- test_daterange_duration_days_end_minute
- test_daterange_duration_intraday_end_minute
The number of tests that needed to be skipped depended on which tests were skipped. For example, the suite would pass if only the following two tests were skipped:
or if the following three tests were skipped (but, I believe, not if only any two of these were skipped):
- test_daterange_duration_cal_end_minute
- test_daterange_duration_days_end_minute
- test_daterange_duration_intraday_end_minute
This all suggests to me that it was a memory error rather than related to the repr of a failing test.
I got around the issue with this PR, which removes code from end_minutes that was computing a numpy array of 266400 (or 161610) integers within the strategy (lines L252 through L261). The workaround now gets this array directly from a cached object (the 'calendar') in the same way that the preceding start_minutes strategy does.
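For anyone following along, here's a minimal sketch of the pattern behind that change, not the actual code from the PR. All names (Calendar, build_minutes, end_minute_rebuilding, end_minute_cached, count_builds) are made up for illustration, a stdlib array stands in for the numpy array, and draw_index is a plain callable standing in for hypothesis's draw so the snippet runs without hypothesis installed.

```python
from array import array
from functools import cached_property

BUILD_COUNT = 0  # how many times the large array has been constructed


def build_minutes(n):
    """Stand-in for the expensive computation of the minutes array."""
    global BUILD_COUNT
    BUILD_COUNT += 1
    return array("q", range(n))


class Calendar:
    """Hypothetical stand-in for the cached 'calendar' object."""

    def __init__(self, n_minutes):
        self.n_minutes = n_minutes

    @cached_property
    def minutes(self):
        # Built on first access only; later accesses reuse the same array.
        return build_minutes(self.n_minutes)


def end_minute_rebuilding(draw_index, calendar):
    """Before: a fresh array is allocated every time the body executes."""
    minutes = build_minutes(calendar.n_minutes)
    return minutes[draw_index(len(minutes))]


def end_minute_cached(draw_index, calendar):
    """After: the array is fetched from the cached object."""
    minutes = calendar.minutes
    return minutes[draw_index(len(minutes))]


def count_builds(strategy_body, runs=5):
    """Run a strategy body `runs` times and report how many arrays were built."""
    global BUILD_COUNT
    BUILD_COUNT = 0
    calendar = Calendar(266_400)
    for _ in range(runs):
        # Always pick the last index; a real strategy would draw it.
        strategy_body(lambda n: n - 1, calendar)
    return BUILD_COUNT
```

Calling count_builds(end_minute_rebuilding) builds one array per execution, whereas count_builds(end_minute_cached) builds a single array for the whole run. This only shows why the cached version allocates less; it says nothing about where the memory was being retained.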
One explanation for the MemoryError might therefore be that hypothesis is storing all values computed within the strategy each and every time the strategy code is executed...?
Cheers.