CI: Add a Cygwin run to GHA CI.
PR Summary
Add a Cygwin run to GHA, as suggested by https://github.com/matplotlib/matplotlib/issues/22997#issuecomment-1120064419
PR Checklist
Tests and Styling
- [ ] Has pytest style unit tests (and
pytestpasses). - [ ] Is Flake 8 compliant (install
flake8-docstringsand runflake8 --docstring-convention=all).
Documentation
- [ ] New features are documented, with examples if plot related.
- [ ] New features have an entry in
doc/users/next_whats_new/(follow instructions in README.rst there). - [ ] API changes documented in
doc/api/next_api_changes/(follow instructions in README.rst there). - [ ] Documentation is sphinx and numpydoc compliant (the docs should build without error).
I think that libjpeg-devel is also required (as you probably have figured out as well).
I can try that and see if it works (I shouldn't run into python-pillow/pillow#6216)
The build is now failing in the expected place, so let's see if the suggestion in https://github.com/matplotlib/matplotlib/issues/22997#issuecomment-1120195190 allows the build to finish.
The build is succeeding now, but the tests fail with ImageComparisonErrors due to using the wrong version of FreeType. Let's see if that compiles now.
The freetype compile still fails during configure; the closest Cygwin build script I can find is here. The first patch doesn't look like the kind of thing to fix a configure crash (probably the same as this one), and the name of the second patch doesn't look relevant.
The only other thing I can think of is running autoconf/autoreconf.
The Cygwin CI is still failing in FreeType configure: https://github.com/matplotlib/matplotlib/runs/6436739391?check_suite_focus=true#step:11:756
I can get the compilation done using system FreeType and QHull, but building the FreeType version needed for the tests to pass fails, either during the libtoolize step of autoreconf or during configure without that step.
Is there a way to skip all the tests that require an exact image match? It wouldn't provide a thorough test of Matplotlib on Cygwin, but most of the differences from Windows and Linux are in compiling and importing the extensions, so that might be enough.
The bundled FreeType build fails even if I push through after the libtoolize failure. I can try again without the bundled FreeType, and see if I can eliminate the fork() failures, even if the ImageComparisonFailures stick around.
I don't know why upload-artifact fails; the previous step should give everyone read and write permissions to everything in that directory and also execute permissions to every sub-directory.
I haven't seen that error message on a fork() failure before:
1 [main] python3.8 (5008) C:\cygwin\bin\python3.8.exe: *** fatal error in forked process - MEM_COMMIT failed, Win32 error 1455
13465 [main] python3.8 (5008) cygwin_exception::open_stackdumpfile: Dumping stack trace to python3.8.exe.stackdump
2003466 [main] python 2011 dofork: child -1 - forked process 5008 died unexpectedly, retry 0, exit code 0x100, errno 11
The upload-artifact step is also still failing, and I have no idea why.
Do Windows ReadAndExecute permissions not imply Read or ReadPermissions? I wouldn't think upload-artifact would need Write, ExecuteFile, or delete permissions.
And now the FreeType build is failing with no apparent reason given.
I do not understand all the extra work that needs to be done here. On Cygwin x86_64, I installed Python 3.9, gcc-c++, make, and I think python39-matplotlib, and was able to build FreeType just fine (with #23066). I did have trouble with downloading qhull, which I have not yet investigated.
I was also able to build FreeType locally, but it still fails in CI, for reasons the error message does not seem to explain. Trying to narrow down the cause of that failure is the bulk of the commits.
fatal error in forked process - MEM_COMMIT failed, Win32 error 1455
Well that's a new kind of fork() failure.
The most recent commit should have stopped trying to install wxPython, so I'm not sure what to do about the six failing MPLBACKEND: wxagg tests
From this cryptic (truncated!) message:
stderr = 'Traceback (most recent call last):\n File "<string>", line 1, in <module>\n File "/cygdrive/d/a/matplotlib/matplotl....8/site-packages/wx/core.py", line 12, in <module>\n from ._core import *\nImportError: No such file or directory\n'
I think that wx is still installed enough to pass our "can we test this" filter (it looks like there is a wx/core.py is site-packages?!) but not enough to actually run (the _core (I assume) c-extensions are mssing).
https://github.com/matplotlib/matplotlib/blob/0aac9f17b9373f66c709f436149568bda69595b3/lib/matplotlib/tests/test_backends_interactive.py#L24-L68
This logic can not actually import the GUI toolkits because importing more than one can lead to all manner of trouble so we check with find_spec and then run the actual tests is a sub-process (sorry if I am over explaining obvious things).
It looks like it is coming from the system packages?
15 install python3-wx-common 4.0.7.post2-4
...
401 install python38-wx 4.0.7.post2-4
https://github.com/matplotlib/matplotlib/actions/runs/3606263992/jobs/6077436300#step:4:104 https://github.com/matplotlib/matplotlib/actions/runs/3606263992/jobs/6077436300#step:4:490
FAILED lib/matplotlib/tests/test_matplotlib.py::test_tmpconfigdir_warning - AssertionError: assert 'set the MPLCONFIGDIR' in ''
634
+ where '' = CompletedProcess(args=['/usr/bin/python', '-c', 'import matplotlib'], returncode=0, stderr='').stderr
635
FAILED lib/matplotlib/tests/test_pyplot.py::test_pylab_integration - BlockingIOError: [Errno 11] Resource temporarily unavailable
636
FAILED lib/matplotlib/tests/test_simplification.py::test_throw_rendering_complexity_exceeded
The tempdir one is already skiped on nt, I assume cygwin has the same limitations and should be skipped?
The rendering complexity one may be an actual bug that is testing that c code bails on too-big things. https://github.com/matplotlib/matplotlib/issues/21789 seems related and the most recent PR to touch this code was https://github.com/matplotlib/matplotlib/pull/19343.
I have no good theory on the pylab one.
FAILED lib/matplotlib/tests/test_matplotlib.py::test_tmpconfigdir_warning - AssertionError: assert 'set the MPLCONFIGDIR' in '' 634 + where '' = CompletedProcess(args=['/usr/bin/python', '-c', 'import matplotlib'], returncode=0, stderr='').stderr 635 FAILED lib/matplotlib/tests/test_pyplot.py::test_pylab_integration - BlockingIOError: [Errno 11] Resource temporarily unavailable 636 FAILED lib/matplotlib/tests/test_simplification.py::test_throw_rendering_complexity_exceededThe tempdir one is already skipped on nt, I assume cygwin has the same limitations and should be skipped?
Cygwin implements its own permissions: to check I just ran the code from the test locally and got the expected warning. Extrapolating from this FAQ entry, I think whether the test could work would depend on whether the underlying filesystem is FAT/FAT32 (fails) or NTFS/NFS (can pass). There's a possibility there's default ACLs on one of the parent directories (implied by a different FAQ entry).
The other possibility is that the test for root isn't working since the user ID for "Administrator" even running as administrator is not 0 by default (see this user guide description). I could try changing that using mkpasswd, or try to look up how to tell if the current session has the relevant "Access files regardless of permissions" token.
The rendering complexity one may be an actual bug that is testing that c code bails on too-big things. #21789 seems related and the most recent PR to touch this code was #19343.
I have no good theory on the pylab one.
That's a fork() failure, reasonably well-known on Cygwin. The three main recommendations are:
- stop antivirus software (probably not relevant on CI, also not something we can control on CI)
- rebase all DLLs so they have distinct memory regions to load to so they don't collide while Cygwin tries to recreate the parent memory layout in the child (done here in this PR; I can try adding
/binto be absolutely sure all files have an address and that address is found - restart the computer and hope it goes away (WIndows makes fork even harder sometimes; probably not relevant on CI)
It looks like all the rest are fork failures. There was only one before I tried to fix it, so let's see if undoing that fix gets things passing.
In case that doesn't work, is there some way to tell pytest that all instances of BlockingIOError: [Errno 11] Resource temporarily unavailable should be treated as a non-strict xfail of the relevant test on Cygwin?
That logic should probably go around the subprocess calls in the helper used in the sphinx tests + subprocess_run_helper?
Might also be worth putting wrapped versions of the subproccess functions we use an matplotlib.testing and then making sure we never directly use the stdlib functions?
The new CI run passes! Now let's see if it can do that consistently.
It looks like the remaining failures are mostly docs-related, then a compilation failure on Windows.
Should I have moved the stdint include after the Python.h include rather than moving Python.h before stdint.h?
Should I squash the first ninety or so commits? Most of them are changes to the debugging level, and one or two got merged in other PRs; it's mostly the last six or so that do other things.
On one hand, it is currently at a very nice number of commits (100 🤯), but on the other I do not think we need the full history and pushing this to one commit for the c change, one commit for the change to the tests and one commit for the GHA config is probably the right move...
We can also squash-merge.
I am inclined to squash-merge this as-is.
Seems you aren't attached to the 100-commit bit, so here's a three-commit version.
Looking at the skipped tests, several of the dependencies are available on Cygwin; I could try adding them and seeing what happens. The TeX tests are likely to run into fork failures, as are the ImageMagick animation tests and those requiring GhostScript, but the Tk tests should work fine.
Note, we do prefer rebases over merges, if you can.
It's not clear to me that this is worth the added CI load and complexity.