pytest-xdist
pytest-xdist copied to clipboard
Fixture teardown doesnt always complete
Hi,
I have a number of fixtures that sometimes can take a bit of time to finish both setting up and tearing down, I keep seeing inconsistencies in the teardown completing, i recreated the following simple test
if the sleep inside the teardown of the fixture is anything above 5 seconds, the file reflecting teardown-step2 is never created, Is this expected behavior? is there some sort of timeout on failure before the test execution is aborted completely in the worker after a failure?
The code
import pytest
import tempfile
import os
import logging
import time
logger = logging.getLogger(__file__)
@pytest.fixture(scope="class")
def cleandir():
newpath = tempfile.mkdtemp()
os.chdir(newpath)
yield
logger.warning("Doing Teardown")
open("teardown-step1", "w")
time.sleep(20)
open("teardown-step2", "w")
@pytest.mark.usefixtures("cleandir")
class TestDirectoryInit:
def test_cwd_starts_empty(self):
assert os.listdir(os.getcwd()) == []
with open("myfile", "w") as f:
f.write("hello")
def test_cwd_again_starts_empty(self):
logger.info(os.getcwd())
logger.info(os.listdir(os.getcwd()))
assert os.listdir(os.getcwd()) == []
The output
$ pytest -s -v -x -n=5 --log-cli-level=INFO --dist=loadscope ./e2etest/tests/pydisttest1.py
============================= test session starts =============================
platform win32 -- Python 3.8.1, pytest-5.3.5, py-1.8.1, pluggy-0.13.1 -- c:\users\niroa\appdata\local\programs\python\python38\python.exe
cachedir: .pytest_cache
rootdir: E:\repos2\datareplication-service-nodms\datareplication-main-nodms
plugins: forked-1.1.3, logger-0.5.1, xdist-1.32.0
gw0 I / gw1 I / gw2 I / gw3 I / gw4 I
[gw0] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw1] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw2] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw3] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw4] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw0] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw1] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw2] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw3] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw4] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
gw0 [2] / gw1 [2] / gw2 [2] / gw3 [2] / gw4 [2]
scheduling tests via LoadScopeScheduling
e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_starts_empty
[gw0] PASSED e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_starts_empty
e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_again_starts_empty
[gw0] FAILED e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_again_starts_empty
================================== FAILURES ===================================
________________ TestDirectoryInit.test_cwd_again_starts_empty ________________
[gw0] win32 -- Python 3.8.1 c:\users\xxx\appdata\local\programs\python\python38\python.exe
self = <tests.pydisttest1.TestDirectoryInit object at 0x0000029D2BFFBA00>
def test_cwd_again_starts_empty(self):
logger.info(os.getcwd())
logger.info(os.listdir(os.getcwd()))
> assert os.listdir(os.getcwd()) == []
E AssertionError: assert ['myfile'] == []
E Left contains one more item: 'myfile'
E Full diff:
E - ['myfile']
E + []
E:\repos2\xxx\e2etest\tests\pydisttest1.py:29: AssertionError
------------------------------ Captured log call ------------------------------
INFO E:\repos2\xxx\e2etest\tests\pydisttest1.py:pydisttest1.py:27 C:\Users\xxx\AppData\Local\Temp\tmpu66u09mc
INFO E:\repos2\xxx\e2etest\tests\pydisttest1.py:pydisttest1.py:28 ['myfile']
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!
======================== 1 failed, 1 passed in 11.12s =========================
I have seen notice exactly the same issue when investigating my tests
If i have teardown that lasts longer than 5 sec, it is being stopped. If i run it without xdist it works fine.
pytest: 7.3.1 xdist: 3.3.1
example code:
import time
@pytest.fixture(scope="module")
def setup_teardown_main(request):
log.debug(f"SETUP MAIN")
yield
log.debug(f"START TEARDOWN MAIN")
for i in range(10):
time.sleep(1)
log.debug(f"sleeping: {i}")
log.debug(f"FINISH TEARDOWN MAIN")
@pytest.fixture(scope="function")
def setup_teardown_test(request, setup_teardown_main):
log.debug(f"SETUP TEST")
yield
log.debug(f"TEARDOWN TEST")
@pytest.mark.parametrize(
"tc_id",
[
("TCASE-00001"),
("TCASE-00002"),
("TCASE-00003"),
],
)
def test_create(
tc_id,
setup_teardown_test,
):
log.debug("TEST")
if tc_id == "TCASE-00002":
assert False
execution:
pytest -n 2 test_sandbox.py -vv -s -o log_cli=true
output:
2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_main 151 :: SETUP MAIN
2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_test 163 :: SETUP TEST
2023-08-11 17:47:58,401 DEBUG test_sandbox.py test_create 180 :: TEST
2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_test 165 :: TEARDOWN TEST
2023-08-11 17:47:58,402 DEBUG test_sandbox.py setup_teardown_test 163 :: SETUP TEST
2023-08-11 17:47:58,402 DEBUG test_sandbox.py test_create 180 :: TEST
2023-08-11 17:47:58,403 DEBUG test_sandbox.py setup_teardown_test 165 :: TEARDOWN TEST
2023-08-11 17:47:58,403 DEBUG test_sandbox.py setup_teardown_main 153 :: START TEARDOWN MAIN
2023-08-11 17:47:59,408 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 0
2023-08-11 17:48:00,411 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 1
2023-08-11 17:48:01,415 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 2
2023-08-11 17:48:02,421 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 3
2023-08-11 17:48:03,423 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 4
EDIT:
The same behavior when using -n 1 so basically using just one worker
This is an extremely concerning issue. I've been experiencing a similar problem. How has this not been addressed since 2020?
@nicoddemus @hpk42 you're the top contributors to this repo. Would you be able to weigh in on the above issue? It seems to be a fundamental flaw, and far too high impact to have fallen through the cracks between now and 2020.
i believe its related to https://github.com/pytest-dev/pytest-xdist/blob/bfd28ff39804856afe779d1acf82c2abbfe1dbe3/src/xdist/workermanage.py#L35 being the gateway teardown limit
off the hand its not quite clear how much time it would take to reshape it to be more graceful
@RonnyPfannschmidt thx for the info. It seems it only if affected when test fails. If i have tests finishing correctly, above example with sleeping 10 sec works fine, but once it fails it being cut to 5 sec.
Is there a way for node to wait until teardown is executed fully ? as if not.....i think it means xdist cannot handle more complex teardown at all (in case of failures)
currently its not gracefull if node fail and termination triggers
@RonnyPfannschmidt
I did try to change value of EXIT_TIMEOUT inside the pytest_xdist_setupnodes hook and set it to 100, by:
dsession = config.pluginmanager.getplugin("dsession")
dsession.nodemanager.EXIT_TIMEOUT = 100
# i was able to see it being setup by
print(config.pluginmanager.getplugin("dsession").nodemanager.EXIT_TIMEOUT)
# shows 100
but still...for failing tests it stops after 5 sec, so i do not think that EXIT_TIMEOUT is what set it up
@pajhiwo thanks for validating, in that case i believe there is another timeout missed, this may be either in execnet or in xdist