pygmt
How to resolve flaky tests resulting from using a single GMT session
Description of the problem
There have been instances of flaky tests in PyGMT's test suite, as reported in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-825561365. This likely stems from the fact that PyGMT uses a single GMT session (initiated during import pygmt) instead of a separate GMT session for each figure (see https://github.com/GenericMappingTools/pygmt/pull/327#issuecomment-541782890).
@meghanrjones asked in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827230998 whether we should stick with a single GMT session or use independent sessions per figure:
I understand the original logic behind a single GMT session for all tests in https://github.com/GenericMappingTools/pygmt/pull/327#issuecomment-541782890. Still, I don't expect that users will be attempting to use the entire PyGMT library in a single session, which is the goal of the test suite. So I think it would be worth revisiting this decision. Could it be possible to periodically test the examples/tutorials against baseline images to ensure that producing multiple plots in a single session is consistent and have the unit tests each use individual sessions?
Full code that generated the error
Flaky tests are hard to reproduce (that is more or less their definition), but in PyGMT's case they show up, for example, when a test that passes under pytest pygmt/tests/test_somemodule.py fails when run using make test, or vice versa.
For example, as reported by @meghanrjones in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-825753479:
edit: I have not yet been able to figure out a solution. The two makecpt tests fail if there is a docstring example that imports pygmt and instantiates a figure (e.g., extract_region() in pygmt/clib/session.py and pygmt/src/grdfilter.py) and is tested before pygmt/tests/test_makecpt.py.
Related issues affected by having a single GMT session:
- #217
- https://github.com/GenericMappingTools/pygmt/issues/372#issuecomment-551855032
- #733
- #1582
System information
Please paste the output of python -c "import pygmt; pygmt.show_versions()":
PyGMT information:
version: v0.3.2.dev117+g7466dc31
System information:
python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) [GCC 9.3.0]
executable: ~/username/miniconda3/envs/pygmt/bin/python
machine: Linux-5.4.0-72-generic-x86_64-with-debian-bullseye-sid
Dependency information:
numpy: 1.17.1
pandas: 1.2.3
xarray: 0.17.0
netCDF4: 1.5.6
packaging: 20.9
ghostscript: 9.53.3
gmt: 6.2.0rc1
GMT library information:
binary dir: ~/username/miniconda3/envs/pygmt/bin
cores: 6
grid layout: rows
library path: ~/username/miniconda3/envs/pygmt/lib/libgmt.so
padding: 2
plugin dir: ~/username/miniconda3/envs/pygmt/lib/gmt/plugins
share dir: ~/username/miniconda3/envs/pygmt/share/gmt
version: 6.2.0rc1
Ok, the flakiness appears to have been an upstream GMT issue that was fixed in https://github.com/GenericMappingTools/gmt/pull/3344. There are some tests that are wrong but currently passing (i.e. false positives) identified in https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827847551 and https://github.com/GenericMappingTools/pygmt/issues/1217#issuecomment-827847551 that need to be updated once we bump to GMT 6.2.0rc2.
- [x] pygmt/tests/test_subplot.py #1291
- [x] pygmt/tests/test_text.py #1292
- [x] pygmt/tests/test_wiggle.py #1291
The past few flaky tests revealing GMT bugs have convinced me of the usefulness of the current structure, even though it would be nice to have the option to run tests in parallel.
We seem to be semi-regularly getting failures on windows-latest - Python 3.7 / NumPy 1.18 with ..\tests\test_sph2grd.py::test_sph2grd_outgrid FAILED [ 87%] and ..\tests\test_sph2grd.py::test_sph2grd_no_outgrid FAILED [ 87%], due to issues with the remote file.
Yes, this has been popping up recently, but I don't think it is related to flakiness from a single GMT session, since the error is Error: [ERROR]: Libcurl Error: Timeout was reached. Maybe open a separate issue for this.
https://forum.generic-mapping-tools.org/t/memory-temporary-storage-issues/5256 This post is a good example showing that using a single GMT session sometimes causes issues.