
Request for more helpful `hypothesis.errors.FlakyStrategyDefinition:` error message when a `precondition` is flaky.

Open dcherian opened this issue 1 year ago • 2 comments

I had a hard time debugging a stateful test failure today:

state.delete_group_using_del(data=data(...))
state.check_list_prefix_from_root()
Checking 1 expected keys vs 1 actual keys
['zarr.json']
['zarr.json']
has uncommitted_changes: True
state.teardown()

Traceback (most recent call last):
  File "/Users/deepak/miniforge3/envs/icechunk/lib/python3.12/site-packages/hypothesis/core.py", line 1064, in _execute_once_for_engine
    result = self.execute_once(data)
             ^^^^^^^^^^^^^^^^^^^^^^^
...
  File "/Users/deepak/miniforge3/envs/icechunk/lib/python3.12/site-packages/hypothesis/internal/conjecture/datatree.py", line 1005, in draw_value
    inconsistent_generation()
  File "/Users/deepak/miniforge3/envs/icechunk/lib/python3.12/site-packages/hypothesis/internal/conjecture/datatree.py", line 52, in inconsistent_generation
    raise FlakyStrategyDefinition(
hypothesis.errors.FlakyStrategyDefinition: Inconsistent data generation! Data generation behaved differently between different runs. Is your data generation depending on external state?

As you can see, this traceback does not tell me why this run was "flaky". After quite some debugging, it turned out that the next rule the state machine expected to fire would not fire because its precondition was not satisfied. That precondition had been satisfied on a previous run, where the rule did fire. Hence the flakiness.

It would be a lot more helpful if hypothesis could surface the fact that a precondition for a particular rule was flaky, or at least tell us what rule it expected to fire next.
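
To make the failure mode above concrete, here is a toy record-and-replay model in plain Python. It is only a sketch and not how Hypothesis is implemented: `InconsistentReplay`, `enabled_rules`, `record`, `replay`, and the `external` dict are all hypothetical names invented for illustration. It shows a rule whose precondition reads state outside the test, so the rule chosen on the first run is no longer enabled on the second.

```python
class InconsistentReplay(Exception):
    """Hypothetical stand-in for hypothesis.errors.FlakyStrategyDefinition."""


def enabled_rules(rules, state):
    """Return the names of rules whose precondition holds for this state."""
    return [name for name, precondition in rules.items() if precondition(state)]


def record(rules, state, choose):
    """First run: choose one enabled rule and remember which one it was."""
    return choose(enabled_rules(rules, state))


def replay(rules, state, recorded):
    """Later run: the recorded rule must still be enabled, or replay fails."""
    if recorded not in enabled_rules(rules, state):
        raise InconsistentReplay(
            f"expected rule {recorded!r} to fire, but its precondition is now false"
        )
    return recorded


# State that lives outside the test and can change between runs (assumption).
external = {"has_changes": True}

rules = {
    "commit": lambda state: external["has_changes"],  # flaky: reads external state
    "check": lambda state: True,  # always enabled
}

# First run: "commit" is enabled and gets chosen.
recorded = record(rules, state={}, choose=lambda names: names[0])

# The external state drifts before the second run.
external["has_changes"] = False

try:
    replay(rules, state={}, recorded=recorded)
    outcome = "consistent"
except InconsistentReplay as err:
    outcome = str(err)
```

In this toy model the error message names the rule that was expected to fire, which is the kind of information the original traceback lacks.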

dcherian avatar Dec 18 '24 21:12 dcherian

Thanks for reporting this - pointing out confusing or unhelpful error messages is really useful!

We choose which rule to run here, which in turn delegates to sampled_from(rules).filter(enabled) here. We don't have a good way to track which rule changed, but perhaps a general note would have helped?

try:
    rule = data.draw(st.sampled_from(self.rules).filter(rule_is_enabled))
except FlakyStrategyDefinition as err:
    err.add_note("Specifically, the expected rule could not run - this is usually due to a flaky predicate or an empty bundle.")
    raise

Zac-HD avatar Dec 19 '24 05:12 Zac-HD

A general note would be a big improvement. Is it feasible to add some debug level logging when doing the draw and filter?
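
As a rough illustration of what such logging could look like from the user's side, the sketch below wraps a filter predicate so that every evaluation is logged at DEBUG level. This is an assumption about one possible approach, not something Hypothesis provides: the `logged` helper, the `stateful_debug` logger name, and the `ListHandler` class are all hypothetical.

```python
import logging

logger = logging.getLogger("stateful_debug")  # hypothetical logger name


def logged(predicate, label):
    """Wrap a predicate so that each evaluation is logged at DEBUG level."""

    def wrapper(item):
        result = predicate(item)
        logger.debug("%s(%r) -> %s", label, item, result)
        return result

    return wrapper


class ListHandler(logging.Handler):
    """Collect formatted log messages in a list for later inspection."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

# Hypothetical predicate: every rule except "commit" is enabled.
is_enabled = logged(lambda name: name != "commit", "rule_is_enabled")
kept = [name for name in ["commit", "check"] if is_enabled(name)]
```

Comparing the logged decisions from two runs would show which rule's predicate changed its answer.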

dcherian avatar Dec 19 '24 13:12 dcherian

Amazing, thank you!

dcherian avatar Dec 01 '25 14:12 dcherian

Thank you for the issue! Feedback like this is so helpful for us.

And I guess thanks also to Claude, who implemented this and seven other PRs in one wild afternoon 🤯

Zac-HD avatar Dec 01 '25 17:12 Zac-HD