pytest
pytest copied to clipboard
Reference leaks caused by `@pytest.mark.parametrize` on newer Python versions
The combination of PyTest, and @pytest.mark.parametrize
causes reference leaks: by this, I mean that the objects parameterizing the test case are not always reliably freed by the time the Python interpreter has shut down. The behavior is somewhat erratic appears most noticeable with the latest Python 3.12.
Here is a basic example involving pure Python code:
import pytest
class Foo:
def __init__(self, value):
self.value = value
print(f'created {self.value}')
def __del__(self):
print(f'deleted {self.value}')
i1 = Foo(1)
i2 = Foo(2)
@pytest.mark.parametrize('i', [i1])
def test_foo(i):
pass
With this, I get (on Python 3.8):
$ python3.8 -m pytest foo.py --capture no
====================================== test session starts ======================================
platform darwin -- Python 3.8.18, pytest-7.4.4, pluggy-1.0.0
rootdir: /Users/wjakob
collecting ... created 1
created 2
collected 1 item
foo.py .
======================================= 1 passed in 0.00s =======================================
deleted 1
deleted 2
On Python 3.12, I get
$ python3.12 -m pytest foo.py --capture no
====================================== test session starts ======================================
platform darwin -- Python 3.12.1, pytest-7.4.4, pluggy-1.3.0
rootdir: /Users/wjakob
plugins: anyio-4.2.0
collecting ... created 1
created 2
collected 1 item
foo.py .
======================================= 1 passed in 0.00s =======================================
deleted 2
In other words, the Foo(1)
instance passed to @pytest.mark.parametrize
never had their __del__
method called.
One remark right away: the use of the _del__
method is of course considered bad practice in Python. I only used it to make a truly minimal example.
The larger context is as follows: I'm the author of the nanobind C++ <-> Python generator and co-author of pybind11. I want these tools to report object leaks in Python bindings, which can turn into a quite serious problem when the tooling provides no hints about such leaks taking place.
The problem is that C++ projects with bindings that use pytest in their test suite now report leaks that aren't the fault of these extensions but due to something weird happening with PyTest, specifically on newer Python versions.
There seems to be some issue related to how PyTest stores the @pytest.mark.parametrize
information that prevents Python's cyclic GC from being able to collect it before the interpreter shuts down.
the parameterize mark stores itself on a attribute of the test function
its strongly recommended not to use objects that at as/like resources as parameter values
my current understanding of the python memory model is that __del__
is gc, not resource management
my understanding of c++ is that RAII is deeply ingrained and mismatches all expectations one can have for python
however in general __del__
cannot be relied on in particular not for interpreter shutdown
so all pybind/nanobind managed objects that need resource management are not suitable as values in parameterize
To clarify: This isn't about resources like files, network connections. — naturally, it would be a bad idea to pass such objects to @pytest.mark.parameterize
. This also isn't about__del__
, C++ RAII, and features to run code when an object is freed. It's one level underneath.
There has been a huge amount of effort in CPython in the last years to make the interpreter 100% clean, in the sense that all non-immortal objects are freed when the interpreter shuts down. This is especially important when spawning sub-interpreters within the same process, since a few minor leaks can accumulate and create a severe problem over time.
My binding library nanobind
reports leaks to help developers avoid such issues. When the Python interpreter has shut down, any remaining Python binding-related objects not freed by the GC are highly suspect and indicative of a problem somewhere.
Which motivates this issue: something within PyTest's @pytest.mark.parametrize
breaks Python's GC, so that it is not able to clean up all objects. That's a memory leak, although a benign one in Pytest's regular command line usage. The more severe consequence is that having hundreds of false reports breaks the point of tracking and reporting refleaks in the first place.
I played a bit with this, and curiously clearing .pytestmark
before PyTest runs fixes the leak issue (though that then breaks the test suite). However, clearing it afterwards via atexit
does not remove the leak.
import pytest
class Foo:
def __init__(self, value):
self.value = value
print(f'created {self.value}')
def __del__(self):
print(f'deleted {self.value}')
i1 = Foo(1)
i2 = Foo(2)
@pytest.mark.parametrize('i', [i1])
def test_foo(i):
pass
def cleanup():
print('at cleanup')
del test_foo.pytestmark
import gc
gc.collect()
import atexit
atexit.register(cleanup)
.. which makes me wonder: could it be that PyTest stashes the .pytestmark
information somewhere else besides .pytestmark
when it runs?
Pytest Takes the markers into nodes as well as using parameter values as lookup keys
What you described is a resource management problem, I don't care what definition you wana use, if it's depends on GC cleanup it's wrong to put it into a parameterset
At exit is possibly too late
I recommend trying removal of the data in the teardown of a autouse fixture to get a better location for a initial hack
I'm assuming here that the objects being put into the parameter aren't resource handles that depend on being deallocated at any specific point. Something like an enum or a simple struct
containing some integers. I don't see any issue with putting something like that into parameterize
. It's fine for that to be deleted whenever.
But it should be deleted at some point. Leaks should always be avoided, even benign ones. That has motivated leak checkers for Python, valgrind, and lots of other tooling that we have to orderly bring down processes.
I want to dig more into what's happening in pytest but don't know where to start: what you described ("putting markers into nodes") and using them as lookup keys: could you point me to the data structures responsible (pytest.???
)? Any help is appreciated!
Thanks, Wenzel
https://github.com/pytest-dev/pytest/blob/main/src/_pytest/python.py#L1166 is the starting point of taking stuff from parametrize to test parameters
its passed into python functions via https://github.com/pytest-dev/pytest/blob/main/src/_pytest/python.py#L1706C12-L1706C12
and https://github.com/pytest-dev/pytest/blob/main/src/_pytest/python.py#L476 handles the ungodly tie-up