RuleBasedStateMachine is prone to Unsatisfiable errors
A state machine frequently fails with hypothesis.errors.Unsatisfiable when the input strategies to its rules themselves frequently produce examples that are marked as invalid.
For example,
from hypothesis import strategies as st
from hypothesis.stateful import RuleBasedStateMachine, rule

class MyStateMachine(RuleBasedStateMachine):
    @rule(data=st.lists(st.text(), min_size=5, unique=True))
    def rule1(self, data):
        assert data is not None

TestMyStateMachine = MyStateMachine.TestCase
Yields:
E hypothesis.errors.Unsatisfiable: Unable to satisfy assumptions of run_state_machine
venv/lib/python3.10/site-packages/hypothesis/stateful.py:112: Unsatisfiable
--------------------------------------------------------------------------------------------------------- Hypothesis ---------------------------------------------------------------------------------------------------------
You can add @seed(187837441642656874040035655188699191288) to this test or run pytest with --hypothesis-seed=187837441642656874040035655188699191288 to reproduce this failure.
=================================================================================================== Hypothesis Statistics ====================================================================================================
myproj/test_hypothesis.py::TestMyStateMachine::runTest:
- during generate phase (31.27 seconds):
- Typical runtimes: ~ 0-55 ms, of which < 1ms in data generation
- 0 passing examples, 0 failing examples, 1000 invalid examples
- Events:
* 56.80%, Retried draw from text().filter(not_yet_in_unique_list) to satisfy filter
* 43.20%, Aborted test because unable to satisfy just(Rule(targets=(), function=rule1, arguments={'data': lists(text(), min_size=5, unique=True)}, preconditions=(), bundles=())).filter(RuleStrategy(machine=MyStateMachine({...})).is_valid).filter(lambda r: <unknown>)
- Stopped because settings.max_examples=100, but < 10% of examples satisfied assumptions
This is the minimal example I could find; my actual state machine is much larger but exhibits the same error.
The state machine works fine when used with a simpler or more reliable strategy. The st.lists(st.text(), min_size=5, unique=True) strategy also works fine when used with @given (i.e. not in a state machine), although the stats show that it does frequently return invalid examples.
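For reference, a plain @given test with the same strategy passes (a minimal sketch; the test name and assertion are illustrative):

from hypothesis import given, strategies as st

@given(data=st.lists(st.text(), min_size=5, unique=True))
def test_unique_text_lists(data):
    # Mirrors rule1 above; this passes under @given even though
    # unique=True still retries draws internally to satisfy the filter.
    assert data is not None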
@Zac-HD I see why you tagged this performance, but I do want to note that this is a pretty big obstacle to us being able to use state machines correctly.
In order to work around this issue, we need to make sure that all our composite strategies rarely or never mark examples as invalid; i.e., we basically cannot use assume or filter, or any of the built-in strategies that leverage filtering.
Specifically, we have to do the opposite of what the Hypothesis docs recommend for composite strategies: https://hypothesis.readthedocs.io/en/latest/data.html#composite-strategies. In the reimplementing_sets_strategy example, we have to do things the "bad" way, since the "good" way means we run into this issue whenever we try to use the strategy in a state machine.
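To make the workaround concrete, here is a rough sketch of the two styles (the strategy names are illustrative, not taken from the docs verbatim): the assume-based version rejects duplicates and so marks examples invalid, while the loop-based version never rejects anything and is what we have to use inside state machines.

from hypothesis import assume, strategies as st

@st.composite
def unique_texts_with_assume(draw, size=5):
    # Draw, then reject duplicates via assume(); every rejection counts
    # as an invalid example, which starves the state machine.
    values = draw(st.lists(st.text(), min_size=size, max_size=size))
    assume(len(set(values)) == size)
    return values

@st.composite
def unique_texts_without_filtering(draw, size=5):
    # Workaround: keep drawing until we have enough distinct values,
    # so no example is ever marked invalid.
    seen = []
    while len(seen) < size:
        value = draw(st.text())
        if value not in seen:
            seen.append(value)
    return seen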
This turned out to be pretty simple in the end! I thought that https://github.com/HypothesisWorks/hypothesis/pull/3894 might have helped, but it didn't at all, and at that point I was pretty confident that the problem wasn't really too much filtering. Instead, it turned out to be trying to generate too much data; the fix is simply to stop taking additional steps once we've already generated 80% as much data as is possible. (We could tune it more precisely based on how many steps we've taken so far, but it's not really worth the trouble.)
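Conceptually, the fix amounts to a budget check in the state machine's run loop, something like the sketch below (illustrative pseudocode only; the names are not Hypothesis's actual internals or public API, and only the 80% figure comes from the comment above):

# Illustrative sketch only -- not Hypothesis's real internals.
DATA_BUDGET_FRACTION = 0.8  # stop adding steps once 80% of the data budget is spent

def should_take_another_step(bytes_drawn_so_far: int, max_bytes: int) -> bool:
    # Rather than drawing another rule's arguments and risking an aborted
    # (invalid) example, stop the run cleanly once the budget is nearly spent.
    return bytes_drawn_so_far < DATA_BUDGET_FRACTION * max_bytes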
So why did avoiding filters seem to help? My best guess is that it's because your filter-free strategies (a) don't generate-and-discard when attempting to satisfy the filter, and (b) might generate smaller and simpler inputs overall. Happily, you'll now be able to use the same strategies across @given()-based and stateful testing 😁