pytest-xdist icon indicating copy to clipboard operation
pytest-xdist copied to clipboard

Fixture teardown doesnt always complete

Open niroa opened this issue 5 years ago • 8 comments

Hi,

I have a number of fixtures that sometimes can take a bit of time to finish both setting up and tearing down, I keep seeing inconsistencies in the teardown completing, i recreated the following simple test

if the sleep inside the teardown of the fixture is anything above 5 seconds, the file reflecting teardown-step2 is never created, Is this expected behavior? is there some sort of timeout on failure before the test execution is aborted completely in the worker after a failure?

The code

import pytest
import tempfile
import os
import logging
import time
logger = logging.getLogger(__file__)

@pytest.fixture(scope="class")
def cleandir():
    newpath = tempfile.mkdtemp()
    os.chdir(newpath)
    yield
    logger.warning("Doing Teardown")
    open("teardown-step1", "w")
    time.sleep(20)
    open("teardown-step2", "w")


@pytest.mark.usefixtures("cleandir")
class TestDirectoryInit:
    def test_cwd_starts_empty(self):
        assert os.listdir(os.getcwd()) == []
        with open("myfile", "w") as f:
            f.write("hello")

    def test_cwd_again_starts_empty(self):
        logger.info(os.getcwd())
        logger.info(os.listdir(os.getcwd()))
        assert os.listdir(os.getcwd()) == []

The output

$ pytest -s -v -x -n=5 --log-cli-level=INFO --dist=loadscope ./e2etest/tests/pydisttest1.py
============================= test session starts =============================
platform win32 -- Python 3.8.1, pytest-5.3.5, py-1.8.1, pluggy-0.13.1 -- c:\users\niroa\appdata\local\programs\python\python38\python.exe
cachedir: .pytest_cache
rootdir: E:\repos2\datareplication-service-nodms\datareplication-main-nodms
plugins: forked-1.1.3, logger-0.5.1, xdist-1.32.0
gw0 I / gw1 I / gw2 I / gw3 I / gw4 I
[gw0] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw1] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw2] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw3] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw4] win32 Python 3.8.1 cwd: E:\repos2\xxx
[gw0] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw1] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw2] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw3] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
[gw4] Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 23:11:46) [MSC v.1916 64 bit (AMD64)]
gw0 [2] / gw1 [2] / gw2 [2] / gw3 [2] / gw4 [2]

scheduling tests via LoadScopeScheduling

e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_starts_empty
[gw0] PASSED e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_starts_empty
e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_again_starts_empty
[gw0] FAILED e2etest/tests/pydisttest1.py::TestDirectoryInit::test_cwd_again_starts_empty

================================== FAILURES ===================================
________________ TestDirectoryInit.test_cwd_again_starts_empty ________________
[gw0] win32 -- Python 3.8.1 c:\users\xxx\appdata\local\programs\python\python38\python.exe

self = <tests.pydisttest1.TestDirectoryInit object at 0x0000029D2BFFBA00>

    def test_cwd_again_starts_empty(self):
        logger.info(os.getcwd())
        logger.info(os.listdir(os.getcwd()))
>       assert os.listdir(os.getcwd()) == []
E       AssertionError: assert ['myfile'] == []
E         Left contains one more item: 'myfile'
E         Full diff:
E         - ['myfile']
E         + []

E:\repos2\xxx\e2etest\tests\pydisttest1.py:29: AssertionError
------------------------------ Captured log call ------------------------------
INFO     E:\repos2\xxx\e2etest\tests\pydisttest1.py:pydisttest1.py:27 C:\Users\xxx\AppData\Local\Temp\tmpu66u09mc
INFO     E:\repos2\xxx\e2etest\tests\pydisttest1.py:pydisttest1.py:28 ['myfile']
!!!!!!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!! xdist.dsession.Interrupted: stopping after 1 failures !!!!!!!!!!!!
======================== 1 failed, 1 passed in 11.12s =========================

niroa avatar May 29 '20 01:05 niroa

I have seen notice exactly the same issue when investigating my tests

If i have teardown that lasts longer than 5 sec, it is being stopped. If i run it without xdist it works fine.

pytest: 7.3.1 xdist: 3.3.1

example code:

import time

@pytest.fixture(scope="module")
def setup_teardown_main(request):
    log.debug(f"SETUP MAIN")
    yield
    log.debug(f"START TEARDOWN MAIN")
    for i in range(10):
        time.sleep(1)
        log.debug(f"sleeping: {i}")

    log.debug(f"FINISH TEARDOWN MAIN")


@pytest.fixture(scope="function")
def setup_teardown_test(request, setup_teardown_main):
    log.debug(f"SETUP TEST")
    yield
    log.debug(f"TEARDOWN TEST")


@pytest.mark.parametrize(
    "tc_id",
    [
        ("TCASE-00001"),
        ("TCASE-00002"),
        ("TCASE-00003"),
    ],
)
def test_create(
    tc_id,
    setup_teardown_test,
):
    log.debug("TEST")
    if tc_id == "TCASE-00002":
        assert False

execution: pytest -n 2 test_sandbox.py -vv -s -o log_cli=true

output:

2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_main 151 :: SETUP MAIN
2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_test 163 :: SETUP TEST
2023-08-11 17:47:58,401 DEBUG test_sandbox.py test_create 180 :: TEST
2023-08-11 17:47:58,401 DEBUG test_sandbox.py setup_teardown_test 165 :: TEARDOWN TEST
2023-08-11 17:47:58,402 DEBUG test_sandbox.py setup_teardown_test 163 :: SETUP TEST
2023-08-11 17:47:58,402 DEBUG test_sandbox.py test_create 180 :: TEST
2023-08-11 17:47:58,403 DEBUG test_sandbox.py setup_teardown_test 165 :: TEARDOWN TEST
2023-08-11 17:47:58,403 DEBUG test_sandbox.py setup_teardown_main 153 :: START TEARDOWN MAIN
2023-08-11 17:47:59,408 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 0
2023-08-11 17:48:00,411 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 1
2023-08-11 17:48:01,415 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 2
2023-08-11 17:48:02,421 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 3
2023-08-11 17:48:03,423 DEBUG test_sandbox.py setup_teardown_main 156 :: sleeping: 4

EDIT: The same behavior when using -n 1 so basically using just one worker

pajhiwo avatar Aug 11 '23 16:08 pajhiwo

This is an extremely concerning issue. I've been experiencing a similar problem. How has this not been addressed since 2020?

dylan-at-na avatar Aug 14 '23 09:08 dylan-at-na

@nicoddemus @hpk42 you're the top contributors to this repo. Would you be able to weigh in on the above issue? It seems to be a fundamental flaw, and far too high impact to have fallen through the cracks between now and 2020.

dylan-at-na avatar Aug 14 '23 09:08 dylan-at-na

i believe its related to https://github.com/pytest-dev/pytest-xdist/blob/bfd28ff39804856afe779d1acf82c2abbfe1dbe3/src/xdist/workermanage.py#L35 being the gateway teardown limit

off the hand its not quite clear how much time it would take to reshape it to be more graceful

RonnyPfannschmidt avatar Aug 14 '23 12:08 RonnyPfannschmidt

@RonnyPfannschmidt thx for the info. It seems it only if affected when test fails. If i have tests finishing correctly, above example with sleeping 10 sec works fine, but once it fails it being cut to 5 sec.

Is there a way for node to wait until teardown is executed fully ? as if not.....i think it means xdist cannot handle more complex teardown at all (in case of failures)

pajhiwo avatar Aug 14 '23 13:08 pajhiwo

currently its not gracefull if node fail and termination triggers

RonnyPfannschmidt avatar Aug 14 '23 13:08 RonnyPfannschmidt

@RonnyPfannschmidt I did try to change value of EXIT_TIMEOUT inside the pytest_xdist_setupnodes hook and set it to 100, by:

dsession = config.pluginmanager.getplugin("dsession")
dsession.nodemanager.EXIT_TIMEOUT = 100
# i was able to see it being setup by
print(config.pluginmanager.getplugin("dsession").nodemanager.EXIT_TIMEOUT)
# shows 100

but still...for failing tests it stops after 5 sec, so i do not think that EXIT_TIMEOUT is what set it up

pajhiwo avatar Aug 14 '23 14:08 pajhiwo

@pajhiwo thanks for validating, in that case i believe there is another timeout missed, this may be either in execnet or in xdist

RonnyPfannschmidt avatar Aug 14 '23 14:08 RonnyPfannschmidt