pytest-xdist
pytest-xdist copied to clipboard
worker 'gwX' crashed - internal error KeyError: <WorkerController gwX>
When running our test suite with 10 worker using pytest -n 10 --dist loadfile tests/, test execution sporadically fails with the following error message
internal error
failed on setup with "worker 'gwX' crashed while running 'tests/some_test.py::test_something'"
worker 'gwX' crashed while running 'tests/some_test.py::test_something'
and the following traceback
Traceback (most recent call last):
File "/data/venv/lib/python3.7/site-packages/_pytest/main.py", line 269, in wrap_session
session.exitstatus = doit(config, session) or 0
File "/data/venv/lib/python3.7/site-packages/_pytest/main.py", line 323, in _main
config.hook.pytest_runtestloop(session=session)
File "/data/venv/lib/python3.7/site-packages/pluggy/_hooks.py", line 265, in __call__
return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
File "/data/venv/lib/python3.7/site-packages/pluggy/_manager.py", line 80, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
File "/data/venv/lib/python3.7/site-packages/pluggy/_callers.py", line 55, in _multicall
gen.send(outcome)
File "/data/venv/lib/python3.7/site-packages/pluggy/_result.py", line 60, in get_result
raise ex[1].with_traceback(ex[2])
File "/data/venv/lib/python3.7/site-packages/pluggy/_callers.py", line 39, in _multicall
res = hook_impl.function(*args)
File "/data/venv/lib/python3.7/site-packages/xdist/dsession.py", line 112, in pytest_runtestloop
self.loop_once()
File "/data/venv/lib/python3.7/site-packages/xdist/dsession.py", line 135, in loop_once
call(**kwargs)
File "/data/venv/lib/python3.7/site-packages/xdist/dsession.py", line 256, in worker_collectionfinish
self.sched.schedule()
File "/data/venv/lib/python3.7/site-packages/xdist/scheduler/loadscope.py", line 341, in schedule
self._reschedule(node)
File "/data/venv/lib/python3.7/site-packages/xdist/scheduler/loadscope.py", line 323, in _reschedule
self._assign_work_unit(node)
File "/data/venv/lib/python3.7/site-packages/xdist/scheduler/loadscope.py", line 261, in _assign_work_unit
worker_collection = self.registered_collections[node]
KeyError: <WorkerController gwX>
with the X in gwX being the number of the worker that failed.
pytest related packages we use:
$ pip freeze | grep pytest
pytest==6.2.5
pytest-asyncio==0.15.1
pytest-cov==2.12.1
pytest-flask==1.2.0
pytest-forked==1.3.0
pytest-lazy-fixture==0.6.3
pytest-mock==3.6.1
pytest-xdist==2.3.0
python version:
$ python --version
Python 3.7.1
From my understanding it can happen that a test failure causes a worker to crash, in which case
pytest-xdist will [by default] automatically restart that worker and report the test's failure source
Could there be a bug in the restart logic which causes the KeyError to be thrown? Or is this expected in such a case?
Additionally, all the output we get from the failed test which caused the worker to crash is what I've shared above, no traceback of the actual test execution, how can I best debug this?