pytest-xdist
Constantly hanging test run with the plugin
Hello,
We are using:
platform linux -- Python 3.9.5, pytest-6.2.5, py-1.10.0, pluggy-0.13.1
plugins: forked-1.4.0, xdist-2.5.0, pytest_check-1.0.4, teamcity-messages-1.29, anyio-3.3.4, testrail-2.9.1, dependency-0.5.1
When we run pytest with xdist against a remote Windows host using --dist=loadfile, the test run hangs.
The command:
python3 -m pytest -vv --dist=loadfile --tx ssh=admin@test-host-ip --rsyncdir /tmp/autotests_rsync C:\\users\\admin\\pyexecnetcache\\autotests_rsync\\autotests\\testsuite\\positive
The hang occurs in a test that uses a waiter to read a value from a PostgreSQL database via the SQLAlchemy ORM.
The test suite passes a value to the following test:
start_time = Waiter.wait_new(lambda: DbTestData.get_session_records_column_by_record_id(
    DbTestData.start_time, record_id)[0][0],
    check_func=CheckFunctions.check_none,
    error_message=f"Error")
assert start_time is not None, f"Record start_time in db = {start_time}, expected not None"

def query(*args):
    session = SessionHolder.get_session()
    result = session.query(*args)
    session.commit()
    return result
which uses this waiter:
@staticmethod
def wait_new(func: Callable, check_func: Callable = CheckFunctions.check_empty, timeout_value: int = 20,
             timeout_interval: int = 1, error_message: str = ""):
    print(f"Func = {func}")
    value = waiter_exception
    exc_raise_if_fail = TestWaiterException()
    timeout = 0
    in_while = True
    Logger.utils_logger.debug(f"timeout_value = {timeout_value}, timeout_interval = {timeout_interval})")
    while in_while:
        print(f"in_while loop")
        try:
            print(f"Trying execute func")
            value = func()
        except Exception as ex:
            print(f"Exception")
            if timeout == timeout_value:
                exc_raise_if_fail.with_traceback(sys.exc_info()[2])
                exc_raise_if_fail.txt += ": " + ex.args[0]
                in_while = False
            value = waiter_exception
            Logger.utils_logger.debug(f"Exception", exc_info=True)
        finally:
            print(f"Finally")
            Logger.utils_logger.debug(f"Current value: {value}")
            if (timeout > timeout_value) or (value != waiter_exception and not check_func(value)):
                print(f"Break")
                break
            else:
                print(f"Else")
                timeout += timeout_interval
                time.sleep(timeout_interval)
    if value == waiter_exception:
        Logger.utils_logger.critical(f"{exc_raise_if_fail.txt}, {error_message}")
        raise exc_raise_if_fail
    return value
It hangs permanently while executing the waiter, and only when we use the xdist plugin.
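Note that the waiter counts its timeout only between calls to func(), so if func() itself blocks, the loop never gets a chance to time out. One way to see where it is stuck is the standard library's faulthandler module, which can dump every thread's stack after a delay without changing which thread the code runs on. This is only a sketch: the helper name dump_stacks_if_stuck and the log path are hypothetical and not part of our test suite.

```python
import contextlib
import faulthandler


@contextlib.contextmanager
def dump_stacks_if_stuck(log_path, timeout_s=30):
    """Write all thread stacks to log_path if the wrapped block runs longer than timeout_s.

    Hypothetical helper for diagnosing a hang; it does not interrupt the block.
    """
    # The file must stay open while the watchdog is armed, so the dump is
    # cancelled before the file is closed.
    with open(log_path, "w") as log_file:
        faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)
        try:
            yield
        finally:
            faulthandler.cancel_dump_traceback_later()
```

Wrapping the Waiter.wait_new(...) call in `with dump_stacks_if_stuck("hang_stacks.log"):` on the worker should leave a stack dump in that file showing the frame in which each thread is blocked.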
Hi, I took the liberty of making the code block multi-line.
Does it only hang on Windows hosts, or also on Linux hosts?
It is possible that the execmodel code that was added creates an issue by having a remote_exec not run on the main thread.
Do your utilities require running in the main thread, by chance?
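A quick way to check this is to log, from inside the test or the waiter, whether the code is running on the main thread. This is a minimal sketch using only the standard library; the helper name running_in_main_thread is hypothetical.

```python
import threading


def running_in_main_thread() -> bool:
    """Return True if the caller is executing on the interpreter's main thread."""
    return threading.current_thread() is threading.main_thread()


# Example: print this from the test body on the worker to see where it runs.
print(f"main thread: {running_in_main_thread()}, thread name: {threading.current_thread().name}")
```

If this reports False on the worker while the same code reports True in a plain pytest run, the thread the test runs on is a likely suspect.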
Also, for context: execmodel is something in execnet itself.
To verify the issue, one may need to downgrade execnet to a version older than execnet 1.2, from before 2014.
Thank you for your reply. I tried to downgrade execnet to version 1.2, but the problem remained, the tests also hang.
Version 1.2 is the first version with the suspected issue; please try an even older one.
We tried running the tests on version 1.1, but the problem remained. We also use execnet in fixtures before the test, and that works fine.
Then right now I'm unaware of what causes this. I presume there is no known simplified reproducer?
We did some investigation: the hang occurs when executing any execnet script on the remote host. We created a test class to see how it executes, and it also hangs:
`
class TestLogger:
    @pytest.fixture(scope="class")
    def testing_loggger(self):
        try:
            print(f"user: {reserve_user}, host: {reserve_host}")
            gw = execnet.makegateway(f"ssh={reserve_user}@{reserve_host}//python=python3.9")
            channel = gw.remote_exec("""
                try:
                    import os, traceback, logging
                    logging.basicConfig(level=logging.DEBUG, filename=f"C:/test_artifacts/reserve_station_logs/reserve_data_files_sizes.log", filemode='w', format='%(asctime)s - %(levelname)s - %(message)s', datefmt='%d-%m-%Y %H:%M:%S')
                    flow = None
                    dirs = []
                    audio_data_files_sizes = {}
                    logging.debug("Something")
                    channel.send(("a", "b"))
                except Exception as ex:
                    logging.error("Exception", exc_info=True)
                    channel.send(ex)
            """)
            audio_data_files_sizes, video_files_sizes = channel.receive()
            print(audio_data_files_sizes, video_files_sizes)
        except Exception:
            Logger.tests_logger.error("Fixture testing_loggger", exc_info=True)
            pytest.skip("Fixture testing_loggger failure")

    def test_log(self, testing_loggger):
        assert True
`
Here we used execnet versions 1.1, 1.0.3, and 1.0.5, and launched the tests on one worker.
Then it's possibly an execnet bug. I have a larger change to it in the works, but that's months away from landing.
Hi! @RonnyPfannschmidt Sorry to bother you, but are there any updates on this issue?
Unfortunately not
Hi! I have the same issue. Has this problem been solved yet?