hypothesis When using class based tests, setUp is not called for each hypothesis test

I was just writing some tests for a custom data structure and had a class-based test with a setUp function that initialized a fresh instance for each test (so I don't have to copy the code into each test). Some tests would fail randomly on most, but not all, executions. After investigating this, I found that setUp was simply not called for every hypothesis test, which resulted in a "dirty" data structure and in turn made some tests fail if the values came in the "wrong" order.

Here is an example:

import unittest

from hypothesis import given


class TestHypothesis(unittest.TestCase):

    def setUp(self):
        super(TestHypothesis, self).setUp()
        self.test_set = set()
        print "setUp called"

    @given(unicode)
    def test_example(self, text):
        chars = [c for c in text]
        for c in chars:
            assert c not in self.test_set
        self.test_set.update(chars)
        print "test called with", text

If I run this, I get the following output:

setUp called
test called with 
test called with \U0001bf50
test called with \U0004ac1e
test called with \U000d5c8f\U0002fb61\U00051be8\U000d5c8f\U0002fb61\U0002fb61\U00051be8\U000d5c8f\U00095c18\U0010a11f\U000d5c8f\U00051be8\U00095c18\U0002fb61\U000361af\U000d5c8f\U00019548\U000361af\U000d5c8f\U0010a11f\U000361af\U000d5c8f\U0002fb61\U000361af\U0010a11f\U0010a11f\U00095c18\U000361af\U000361af\U0010a11f\U0002fb61\U0010a11f\U000361af\U00095c18\U00019548\U000d5c8f\U000d5c8f\U00019548\U0002fb61\U0010a11f\U000361af\U00019548\U0010a11f\U00095c18\U000361af
test called with 0

[lots of other lines]

test called with \U00095c0f
Falsifying example: test_example(self=TestHypothesis(methodName='test_example'), text='\U00095c18')

The reason for this is obvious: since setUp was only called once, the data structure got "dirty".

I'm not really sure if this is something that needs to be fixed, but I think it needs to be documented that, when using hypothesis, you should not use setUp in this way (which in my opinion is otherwise a perfectly fine way to use it).

Apr 16 '15 11:04 martinth

You're right, this is working as intended but needs to be better documented.

There's actually a feature for this, but I've now realised I haven't documented that either. Sorry.

If you use instead:

class TestHypothesis(unittest.TestCase):
    def setup_example(self):
        self.test_set = set()

This will be called before each example runs rather than before the whole function.
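As a fuller sketch of the same idea, here is the original test rewritten around setup_example. It is ported to Python 3, with `st.text()` standing in for the `unicode` argument; the `settings` values are my own choice to keep the run short and are not required.

```python
import unittest

from hypothesis import given, settings
import hypothesis.strategies as st


class TestHypothesis(unittest.TestCase):
    def setup_example(self):
        # Runs before every generated example, so each one starts clean.
        self.test_set = set()

    @settings(max_examples=50, database=None)
    @given(st.text())
    def test_example(self, text):
        chars = list(text)
        for c in chars:
            # Holds for every example because the set was just recreated.
            assert c not in self.test_set
        self.test_set.update(chars)
```

With the set recreated per example, the assertion no longer depends on what earlier examples contained.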

I'll leave this open until I've sorted the documentation.

Apr 16 '15 11:04 DRMacIver

Okay, then I'll be using setup_example.

Apr 16 '15 11:04 martinth

Is there a corresponding hook when using @pytest.fixture? I had expected (between discovering the scope= argument and finding this ticket!) that passing scope='function' would reconstitute the fixture for every test run, but the code below fails with the ValueError. (In my actual code, this caused Hypothesis to report the test as Flaky while trying to narrow the example, because it passed when re-run.)

from __future__ import division

import pytest
from hypothesis import given
import hypothesis.strategies as st


@pytest.fixture(scope='function')
def stateful():
    return set()


@given(x=st.integers())
def test_foo(stateful, x):
    if stateful:
        raise ValueError("function-scoped fixture reused!")

    stateful.add('kitten-fur gloves')
    assert x < 1000

Jul 23 '15 10:07 wjt

Sadly, no. As far as I can tell there's no way for me to integrate with py.test fixtures at the per example level.
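One possible workaround, which is not a fixture hook, is to build the state inside the test body, since that code runs once per example. A minimal sketch, assuming the state is cheap to create; `make_stateful` is a hypothetical factory replacing the `stateful` fixture, and the integer strategy is bounded here only so the final assertion can hold.

```python
from hypothesis import given, settings
import hypothesis.strategies as st


def make_stateful():
    """Hypothetical factory standing in for the `stateful` fixture."""
    return set()


@settings(database=None)
@given(x=st.integers(max_value=999))
def test_foo(x):
    stateful = make_stateful()  # created fresh for every example
    if stateful:
        raise ValueError("state reused between examples!")

    stateful.add('kitten-fur gloves')
    assert x < 1000
```

pytest collects `test_foo` as usual, and it can also be called directly as `test_foo()`.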

Jul 23 '15 11:07 DRMacIver

@DRMacIver There is a way to parameterize tests in pytest using the pytest_generate_tests hook. Check this for example.

Aug 20 '15 23:08 thedrow

@thedrow the problem is that Hypothesis won't know how many tests it's going to run during the collection phase, it'll only know while running the tests. See https://github.com/pytest-dev/pytest/issues/916

Aug 21 '15 04:08 The-Compiler

@DRMacIver, thanks for mentioning setup_example! It enabled me to integrate Hypothesis into my unittest.TestCase-based test framework.

When you sort the documentation, a good example use of teardown_example is calling reset_mock on mock.Mock objects. I set up my mock.Mock objects in the setUp phase, so reset_mock is necessary to give the test runs after the first one a clean start.
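A minimal sketch of that pattern, under the assumption that teardown_example receives a single argument (whatever setup_example returned, or None when there is no setup_example). The class and test names are made up for illustration.

```python
import unittest
from unittest import mock

from hypothesis import given, settings
import hypothesis.strategies as st


class TestWithMock(unittest.TestCase):
    def setUp(self):
        # Created once per test method, as in a plain unittest test.
        self.callback = mock.Mock()

    def teardown_example(self, token):
        # Runs after every example; clears recorded calls so the next
        # example sees a mock with no call history.
        self.callback.reset_mock()

    @settings(max_examples=50, database=None)
    @given(st.integers())
    def test_callback_called_once(self, value):
        self.callback(value)
        self.callback.assert_called_once_with(value)
```

Without the reset, the second example would fail `assert_called_once_with`, because the mock would still hold the call from the first.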

Mar 13 '16 09:03 eli-b

It would be nice if hypothesis actually wrapped each example in setUp() and tearDown(). For example, consider this mixed test case:

class MixedTest(TestCase):

    def test_something_without_hypothesis(self):
        pass

    @given(foos())
    def test_something_with_hypothesis(self, foo):
        pass

As a developer, I want to wrap each test in my setup and teardown, but how I do so depends on how the test is implemented. If the test uses hypothesis, I use (setup|teardown)_example(). If the test doesn't use hypothesis, I use (setUp|tearDown). IMO, whether a test uses hypothesis or not should be an implementation detail.

This is currently leading to hacks like this:

class HypothesisTestCase(django.TestCase):

    def setup_example(self):
        self._pre_setup()

    def teardown_example(self, example):
        self._post_teardown()

    def __call__(self, result=None):
        testMethod = getattr(self, self._testMethodName)
        if getattr(testMethod, u'is_hypothesis_test', False):
            return unittest.TestCase.__call__(self, result)
        else:
            return dt.SimpleTestCase.__call__(self, result)

The problem will occur when adding hypothesis to any 3rd party test case that contains setUp and tearDown logic.
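For test cases you control, a lighter alternative is to route both hooks through one helper so the same reset happens either way. This is only a sketch: `_fresh_state` is a hypothetical helper name, and it does not address the third-party case above, where the setUp logic lives in a base class you cannot change.

```python
import unittest

from hypothesis import given, settings
import hypothesis.strategies as st


class MixedTest(unittest.TestCase):
    def _fresh_state(self):
        # Hypothetical helper holding the shared per-test setup.
        self.items = []

    def setUp(self):
        # Used by unittest for every test method.
        self._fresh_state()

    def setup_example(self):
        # Used by Hypothesis before every generated example.
        self._fresh_state()

    def test_something_without_hypothesis(self):
        self.assertEqual(self.items, [])
        self.items.append("plain")

    @settings(max_examples=50, database=None)
    @given(st.integers())
    def test_something_with_hypothesis(self, value):
        self.assertEqual(self.items, [])
        self.items.append(value)
```

Both tests start from an empty list, regardless of whether they use hypothesis.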

Jun 17 '16 14:06 etianen

There's a UX angle to this. As a total newbie with hypothesis, I did find it surprising that setUp wasn't called. I was just about to create an issue with the following example. It wasn't intuitive to me why the first test should fail, while the second doesn't.

from unittest import TestCase

from hypothesis import given
from hypothesis import strategies


class TestFoo(TestCase):

    def setUp(self):
        self.foo = 'meow'

    @given(strategies.none())
    def test_with_hypo(self, nada):
        assert self.foo == 'meow'
        self.foo = 'kitten'

    def test_no_hypo(self):
        assert self.foo == 'meow'
        self.foo = 'kitten'

Feb 03 '17 15:02 maiksprenger

Same issue and surprise here--I expected setUp() to get called before every hypothesis call and was surprised to see it wasn't.

Apr 18 '17 03:04 jdotjdot

I scribbled down “docs” when I saw this email at 5am. I don’t think we document this behaviour anywhere (or if we do, I can’t find it) – getting it documented would be a good first step.

Apr 18 '17 06:04 alexwlchan

> @DRMacIver, thanks for mentioning setup_example! It enabled me to integrate Hypothesis into my unittest.TestCase-based test framework.
>
> When you sort the documentation, a good example use of teardown_example is calling reset_mock on mock.Mock objects. I set-up my mock.Mock objects in the setUp phase, so reset_mock is necessary to enable the test runs after the first one to get a clean start.

Agreed. That is exactly the use case I am using it for, and I would appreciate a "best practice" example.

Sep 16 '20 19:09 awichmann-mintel

I just stumbled over this issue again.

It may be OK to use setup_example if all test cases of a test class use hypothesis. But what happens if some test cases need hypothesis, others need parameterized, and others need no decorator at all? It's very confusing and disturbing that we have setUp with unittest, setup_method with pytest and setup_example with hypothesis.

Why does everybody cook his own soup?

Nov 17 '21 11:11 ghpqans
