Intermittent pytest-xdist failures on Mac CI runners
Mac CI runners sometimes see pytest-xdist worker crashes, e.g. worker 'gw2' crashed.... The crashes seem to trace back to matplotlib's gca() and gcf(). The culprit is usually test_binarygrid_util.py::test_mfgrddisv_modelgrid, though I'm not yet sure why.
Examples:
- https://github.com/modflowpy/flopy/runs/7748574375?check_suite_focus=true#step:9:1731
- https://github.com/modflowpy/flopy/runs/7734831141?check_suite_focus=true#step:9:1730
I think this may be related to a known pytest-xdist issue: tests almost always run in the main thread, but this is not guaranteed. Related discussions:
- https://github.com/pytest-dev/pytest-xdist/issues/469
- https://github.com/pytest-dev/pytest-xdist/issues/620
- https://github.com/pytest-dev/pytest-xdist/issues/739
- https://github.com/pytest-dev/execnet/issues/96
Matplotlib is not thread-safe, but it does not require the caller to be on the main thread. Perhaps there is a weird limitation on the Mac backend.
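To help confirm or rule out the threading theory, a test could log which thread it is running on before it touches matplotlib. Below is a minimal sketch using only the standard library. The helper name describe_thread_context is hypothetical, and reading the MPLBACKEND environment variable only shows whether a backend override is set, not which backend matplotlib actually selected.

```python
import os
import platform
import threading


def describe_thread_context():
    """Collect details worth logging when a plotting test starts."""
    current = threading.current_thread()
    return {
        "thread_name": current.name,
        "is_main_thread": current is threading.main_thread(),
        "platform": platform.system(),  # "Darwin" on macOS
        # MPLBACKEND is matplotlib's environment override; None if unset
        "mpl_backend_env": os.environ.get("MPLBACKEND"),
    }


if __name__ == "__main__":
    print(describe_thread_context())
```

Printing this at the start of the failing test on the Mac runners should show whether the crashes line up with is_main_thread being False.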
If the above is the root cause, a possible workaround is to check whether the test is running on the main thread and skip it if not, e.g. at the start of the test body (requires import threading and import pytest):
    if threading.current_thread() is not threading.main_thread():
        pytest.skip(reason="not on main thread")
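If several plotting tests need the same guard, the check could be wrapped in a decorator rather than repeated in each test. This is only a sketch: requires_main_thread and test_plot_placeholder are hypothetical names, not existing flopy or pytest helpers, and the reason is passed positionally so the call does not depend on the keyword name.

```python
import functools
import threading

import pytest


def requires_main_thread(test_func):
    """Skip the wrapped test when it is not running on the main thread."""

    @functools.wraps(test_func)
    def wrapper(*args, **kwargs):
        if threading.current_thread() is not threading.main_thread():
            pytest.skip("not on main thread")
        return test_func(*args, **kwargs)

    return wrapper


@requires_main_thread
def test_plot_placeholder():
    # Placeholder body; a real test would call the plotting code here.
    assert threading.current_thread() is threading.main_thread()
```

Note that this hides the problem rather than fixing it: a test skipped this way is simply not exercised on that run.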
An alternative is to add a pytest marker for the tests that exercise plot functions and run them serially in a separate CI job. There aren't many of them, so this shouldn't increase CI runtimes much.
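The marker approach might look roughly like the sketch below. The marker name plot and the placeholder test are assumptions for illustration. The pytest_configure hook would live in conftest.py and registers the marker so pytest does not warn about an unknown mark.

```python
import pytest


def pytest_configure(config):
    # In conftest.py: register the (hypothetical) "plot" marker.
    config.addinivalue_line(
        "markers", "plot: test calls matplotlib plotting functions"
    )


@pytest.mark.plot
def test_plot_placeholder():
    # Placeholder body; a real test would call the plotting code here.
    pass
```

The parallel job would then run pytest -n auto -m "not plot", and the serial job would run pytest -m plot without the -n option.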
I will do some more digging.