pytest-xdist
How to debug a crashed gw worker?
Hi,
We are running our test environment using pytest-xdist inside Kubernetes pods. During the last three weeks we have started suffering from crashed workers, and we don't know how to debug the root cause of the failure.
Could someone kindly give me a hint on where to start looking? I only see failures with crashed workers, not the reason why.
[gw8] Python 3.8.10 (default, May 26 2023, 14:05:08) -- [GCC 9.4.0]
13:33:50 [gw2] node down: Not properly terminated
13:33:50 [gw2] [ 50%] FAILED testsuite/ond/sdc/ono_oni_vz/ui/create_service/test_ond_41377_catalog_page_search.py::test_ond_41377_catalog_page_search
13:33:50
13:33:50 replacing crashed worker gw2
13:33:50
[gw9] linux Python 3.8.10 cwd: /home/jenkins/agent/workspace/QA/orion-orchestrator
13:33:50
13:33:51
Hi, I'm using pytest-xdist version 3.5.0 and experiencing the same problem. Did you find a solution for it?
My first recommendation would be to take the resource usage of the pods into consideration.
The second is to ensure there is an actual init process that handles the signals from the k8s management, as a pytest process will not handle them.
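For the resource-usage recommendation, one way to get a first clue is to have each worker record its peak memory before every test, so the last line written by a crashed worker names the test that was in flight. The following is only a minimal sketch for a `conftest.py`, assuming Linux (where `ru_maxrss` is reported in KiB). The helper names and the log file location are hypothetical choices for illustration; `PYTEST_XDIST_WORKER` is the environment variable pytest-xdist sets in each worker.

```python
# conftest.py (sketch; helper names and log location are assumptions)
import os
import resource
import tempfile


def worker_id() -> str:
    # pytest-xdist sets PYTEST_XDIST_WORKER (e.g. "gw2") in each worker process;
    # fall back to "main" when running without xdist.
    return os.environ.get("PYTEST_XDIST_WORKER", "main")


def peak_rss_mib() -> float:
    # Peak resident set size of this process so far.
    # Assumption: Linux, where ru_maxrss is in KiB.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def format_usage(nodeid: str) -> str:
    return f"[{worker_id()}] {nodeid}: peak RSS {peak_rss_mib():.1f} MiB"


def log_path() -> str:
    # Hypothetical location: one log file per worker in the temp directory.
    return os.path.join(tempfile.gettempdir(), f"xdist-rss-{worker_id()}.log")


def pytest_runtest_setup(item):
    # Append one line before each test starts. The file is closed (and so
    # flushed) every time, so the line survives if the worker dies mid-test.
    with open(log_path(), "a") as fh:
        fh.write(format_usage(item.nodeid) + "\n")
```

If the peak RSS in a crashed worker's log approaches the pod's memory limit, that would point towards the container being killed for exceeding it. Note that in a pod the temp directory normally disappears with the container, so the log directory would need to be on a mounted volume or archived by the CI job.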