chore: update matrix to macOS-latest

Open ianna opened this issue 1 year ago • 14 comments

The macOS-11 environment is deprecated and will be removed on June 28th, 2024. Currently, all tests on macOS-11 are cancelled:

Run Tests (macos-11, 3.8, x64, full)
This is a scheduled macOS-11 brownout. The macOS-11 environment is deprecated and will be removed on June 28th, 2024.
Run Tests (macos-11, 3.11, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.9, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.10, x64, full)
GitHub Actions has encountered an internal error when running your job.
Run Tests (macos-11, 3.12, x64, full)
This is a scheduled macOS-11 brownout. The macOS-11 environment is deprecated and will be removed on June 28th, 2024.
Run Tests (macos-11, 3.12, x64, full)
GitHub Actions has encountered an internal error when running your job.

ianna avatar Jun 24 '24 12:06 ianna

Updating to macOS-latest:

  • with either Python 3.11 or 3.12 the following test segfaults on avro_reader:

File "/Users/runner/work/awkward/awkward/tests/test_1345_avro_reader.py", line 19 in test_int

  • CI fails to install Python 3.8, 3.9, and 3.10 because /usr/local/opt/gettext/lib/libintl.8.dylib cannot be loaded:

Installed versions
  Version ~3.8.0-0 was not found in the local cache
  Version ~3.8.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.8.18-9599280229/python-3.8.18-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/50616bfe-b731-46a1-a80b-aa4f625178eb -f /Users/runner/work/_temp/3c64bc7c-fd38-489e-9e10-f25640cb8b27
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.8.18 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[2322]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <76EC6AAE-B1A7-382D-B14F-55446445181E> /Users/runner/hostedtoolcache/Python/3.8.18/x64/bin/python3.8
    Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  Error: ./setup.sh: line 54:  2322 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134

Installed versions
  Version ~3.9.0-0 was not found in the local cache
  Version ~3.9.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.9.19-9599861319/python-3.9.19-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/18f35be1-b42d-4514-a112-6f7bc53cc3f0 -f /Users/runner/work/_temp/d68c0f92-cb89-47e6-a09f-06a471be2d67
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.9.19 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[2312]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <64474517-EFC0-32F5-93D6-1C4BAE8783F9> /Users/runner/hostedtoolcache/Python/3.9.19/x64/bin/python3.9
    Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  ./setup.sh: line 54:  2312 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134
Installed versions
  Version ~3.10.0-0 was not found in the local cache
  Version ~3.10.0-0 is available for downloading
  Download from "https://github.com/actions/python-versions/releases/download/3.10.14-9599980810/python-3.10.14-darwin-x64.tar.gz"
  Extract downloaded archive
  /usr/bin/tar xz -C /Users/runner/work/_temp/83746de3-e758-41ea-b109-892583cdb238 -f /Users/runner/work/_temp/4b092a0a-dd2e-410e-95ca-72eb7a5eb343
  Execute installation script
  Check if Python hostedtoolcache folder exist...
  Create Python 3.10.14 folder
  Copy Python binaries to hostedtoolcache folder
  Create additional symlinks (Required for the UsePythonVersion Azure Pipelines task and the setup-python GitHub Action)
  Upgrading pip...
  Error: dyld[1955]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib
    Referenced from: <09857011-94D0-3FBA-9F9D-9FCE0E7366FF> /Users/runner/hostedtoolcache/Python/3.10.14/x64/bin/python3.10
    Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/System/Volumes/Preboot/Cryptexes/OS/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), '/usr/local/lib/libintl.8.dylib' (no such file), '/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)
  ./setup.sh: line 54:  1955 Abort trap: 6           ./python -m ensurepip
  Error: The process '/bin/bash' failed with exit code 134
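
The three failures above share one shape: dyld names a missing library and then lists every path it tried. A minimal Python sketch for pulling those two pieces out of a pasted log follows. The function name `parse_dyld_error` and the regular expressions are illustrative assumptions, not part of setup-python or dyld.

```python
from __future__ import annotations

import re

# Hypothetical helper: extract the missing library and the searched paths
# from a dyld "Library not loaded" message like the ones in the logs above.
LIB_RE = re.compile(r"Library not loaded: (\S+)")
TRIED_RE = re.compile(r"'([^']+)' \(no such file")


def parse_dyld_error(log: str) -> tuple[str | None, list[str]]:
    """Return (missing_library, unique_paths_tried) found in a log excerpt."""
    match = LIB_RE.search(log)
    missing = match.group(1) if match else None
    # dyld may list the same path twice; keep first occurrences, in order.
    tried = list(dict.fromkeys(TRIED_RE.findall(log)))
    return missing, tried


sample = (
    "Error: dyld[2322]: Library not loaded: /usr/local/opt/gettext/lib/libintl.8.dylib\n"
    "  Reason: tried: '/usr/local/opt/gettext/lib/libintl.8.dylib' (no such file), "
    "'/usr/local/lib/libintl.8.dylib' (no such file), "
    "'/usr/lib/libintl.8.dylib' (no such file, not in dyld cache)"
)
missing, tried = parse_dyld_error(sample)
print(missing)
print(tried)
```

Every searched path sits under /usr/local or /usr/lib, which is worth comparing against wherever gettext is actually installed on the runner.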

ianna avatar Jun 24 '24 13:06 ianna

[159/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthInputBuffer.cpp.o
[160/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/builder/UnionBuilder.cpp.o
[161/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/util.cpp.o
[162/172] Building CXX object CMakeFiles/awkward-cpu-kernels.dir/src/cpu-kernels/awkward_sort.cpp.o
[163/172] Linking CXX shared library libawkward-cpu-kernels.dylib
[164/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthOutputBuffer.cpp.o
...
[168/172] Building CXX object CMakeFiles/_ext.dir/src/python/io.cpp.o
[169/172] Building CXX object CMakeFiles/awkward.dir/src/libawkward/forth/ForthMachine.cpp.o
[170/172] Linking CXX shared library libawkward.dylib
[171/172] Building CXX object CMakeFiles/_ext.dir/src/python/forth.cpp.o
[172/172] Linking CXX shared module _ext.cpython-38-darwin.so
ld: warning: -undefined dynamic_lookup may not work with chained fixups

ianna avatar Jun 25 '24 14:06 ianna

By updating from macos-11, you're seeing the issue that Angus punted on in https://github.com/scikit-hep/awkward/pull/2869/commits/0d83c86d0c9b3bc414e4b35ffb2758691154dffa.

@henryiii, there was a time several months ago when GitHub Actions made a big jump in MacOS version, and some things were broken as a result. I think this was one of them: the linker wasn't right, and therefore libawkward.dylib symbols didn't get properly linked into _ext.*.dylib. (The Avro failure is just the canary in the coalmine: it's the first use of the _ext extension module, through AwkwardForth.) Is this familiar? Do you know what's happening here?

jpivarski avatar Jun 25 '24 15:06 jpivarski

Thanks! I thought it looked familiar :-)

Yes, it's a linker issue. The -undefined dynamic_lookup flag allows the linker to defer symbol resolution until runtime, which can be problematic if the required symbols are not found. This is often used in Python extensions to allow them to be dynamically loaded. Chained fixups are a newer feature in macOS that optimize the way dynamic libraries are loaded. However, they may not always work correctly with -undefined dynamic_lookup.
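
To see whether a given Python build links extension modules with this flag, one can inspect the linker command that the interpreter records in `sysconfig`. This is a minimal sketch: `LDSHARED` is a real sysconfig variable on Unix-like builds, but the helper name `uses_dynamic_lookup` is made up here, and awkward-cpp's CMake build may pass its own linker flags independently of this value.

```python
import sysconfig


def uses_dynamic_lookup(link_command: str) -> bool:
    """True if a linker command line contains '-undefined dynamic_lookup'."""
    tokens = link_command.split()
    # The flag is two adjacent tokens: '-undefined' followed by 'dynamic_lookup'.
    return any(
        first == "-undefined" and second == "dynamic_lookup"
        for first, second in zip(tokens, tokens[1:])
    )


# LDSHARED can be None on some platforms (e.g. Windows); fall back to "".
ldshared = sysconfig.get_config_var("LDSHARED") or ""
print("LDSHARED:", ldshared)
print("defers symbol resolution to load time:", uses_dynamic_lookup(ldshared))
```

The check only recognizes the space-separated spelling; a build that passes the flag as `-Wl,-undefined,dynamic_lookup` would need an extra case.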

Unfortunately, my laptop runs macOS 11.6. I'm planning to upgrade it to 12 this weekend so that I can test whether we need to explicitly export the symbols, and check whether CMakeLists.txt is set up correctly to handle symbol visibility and dynamic linking.

ianna avatar Jun 25 '24 15:06 ianna

My MacOS is 14.5 and I just ran another installation: no compilation issues, linker issues, or dynamic loading issues. All of the tests pass.

(Major versions 11, 12, and 14 seem pretty far apart. I checked on endoflife.date/macos and MacOS 11 was dropped by Apple last September.)

I think we were using an old Mac version because it works on all versions from the version used for compilation onward, and we're therefore covering all versions that are still in service. I don't know why it would fail to compile on GitHub's MacOS 11 and not on my MacOS 14.

jpivarski avatar Jun 25 '24 15:06 jpivarski

Apple was trying to move away from -undefined dynamic_lookup, but it was integral to how modules for languages like Python worked, so it's still valid, AFAIK, and I believe it just disables chained fixups. Older compilers may throw warnings and not work as well. There is a way to do the chained fixups for Python extensions, but it's involved (you have to process the Python binary and build a file with a symbol table, I think) and isn't something we've ever added to pybind11's CMake infrastructure. Wenzel does have it in nanobind's CMake code, so it is possible.

On GHA, macos-latest (and macos-14) are ARM, while macos-13 and earlier are Intel.
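
The mapping described here can be written down as a small lookup. This sketch only encodes what is stated in this thread as of June 2024. The dictionary and function names are hypothetical, and GitHub may change what these labels point to.

```python
# Runner label -> CPU architecture, as described in this thread (June 2024).
# These values are assumptions taken from the comments, not queried from GitHub.
RUNNER_ARCH = {
    "macos-11": "x86_64",
    "macos-12": "x86_64",
    "macos-13": "x86_64",
    "macos-14": "arm64",
    "macos-latest": "arm64",  # pointed at macos-14 when this was discussed
}


def runner_arch(label: str) -> str:
    """Look up the architecture for a runner label, case-insensitively."""
    try:
        return RUNNER_ARCH[label.lower()]
    except KeyError:
        raise ValueError(f"unknown runner label: {label!r}") from None


for label in ("macOS-latest", "macos-13"):
    print(label, "->", runner_arch(label))
```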

I think there are also some issues around AVX instructions when newer macOS versions support older ones, but that's not a linker issue; I think it was just a bug in the newest compilers at one point.

henryiii avatar Jun 25 '24 17:06 henryiii

Have you tried macos-13?

henryiii avatar Jun 25 '24 17:06 henryiii

I tried macos-latest, which defaulted to 13 at the time.

ianna avatar Jun 25 '24 19:06 ianna

latest is 14 now, has been for a few weeks. That would be much faster, but is also a bigger change (Apple Silicon). 13 seems to segfault.

henryiii avatar Jun 25 '24 19:06 henryiii

See my comments above. I think it was 13 😀

ianna avatar Jun 25 '24 19:06 ianna

@jpivarski and @henryiii - the problem was an architecture mismatch - we requested x86, but the macos-latest nodes are arm64, so the actions were downloading incompatible python libraries. It was masked by the fact that the gettext location (as installed by homebrew) is different on the newer architectures. It looks like the actions did not use the environment variables I tried to define. The architecture error became apparent only after a link to the expected location was added.
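
A mismatch like this can be caught early by comparing the architecture a job asks for with the one the machine reports. The sketch below is one way to do that in Python. `platform.machine()` is a standard-library call, while `normalize_arch`, `check_arch`, and the alias table are illustrative assumptions.

```python
import platform

# Common spellings for the same architecture (assumed alias table).
ALIASES = {
    "x64": "x86_64",
    "amd64": "x86_64",
    "x86_64": "x86_64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Map an architecture spelling to a canonical name; unknown names pass through lowercased."""
    return ALIASES.get(name.lower(), name.lower())


def check_arch(requested: str, actual: str) -> bool:
    """Return True when the requested architecture matches the actual one."""
    return normalize_arch(requested) == normalize_arch(actual)


actual = platform.machine()  # e.g. 'arm64' on Apple Silicon, 'x86_64' on Intel
requested = "x64"            # the value that appears in the job names in this issue
if check_arch(requested, actual):
    print(f"architecture ok: {actual}")
else:
    print(f"architecture mismatch: requested {requested}, running on {actual}")
```

Running a check like this as the first step of a job would surface the mismatch before any Python download or build starts.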

The remaining problem is the avro test segfault, which may also be related to the wrong architecture (because all runs well on Jim's laptop ;-)

The bottom line is that we should go for macos-latest. As I understand from this GitHub blog post, macos-14 becomes macos-latest at the same time as macos-11 is retired, and the transition is expected to be complete by June 2024.

ianna avatar Jun 26 '24 12:06 ianna

macos-12 and macos-13 are Intel. And the transition of macos-latest to macos-14 (Apple Silicon) was completed a couple of weeks ago.

henryiii avatar Jun 26 '24 14:06 henryiii

Let's go with macos-latest.

Does this mean that we're not building awkward-cpp for Intel Macs anymore? That would only be okay if they're not supported by Apple. (People with Intel Macs would have to use old versions of Awkward Array, but they're on unsupported hardware, so what can we do?)

jpivarski avatar Jun 26 '24 16:06 jpivarski

We should still be building them with cibuildwheel. I think this only affects the native testing. And they are still supported by Apple: the latest operating systems are still being released for Intel, minus a few features.
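
One way to confirm that Intel wheels are still produced is to look at the platform tag in each built wheel's filename. This is a rough sketch that relies on the standard wheel naming scheme, where the platform tag is the last dash-separated field. The helper names and the filenames are hypothetical and do not come from the awkward-cpp release artifacts.

```python
from __future__ import annotations


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag, the last dash-separated field of a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    return filename[: -len(".whl")].split("-")[-1]


def macos_arch(platform_tag: str) -> str | None:
    """Return the architecture part of a 'macosx_<major>_<minor>_<arch>' tag, else None."""
    parts = platform_tag.split("_", 3)
    if len(parts) == 4 and parts[0] == "macosx":
        return parts[3]
    return None


# Hypothetical filenames, for illustration only.
wheels = [
    "example_pkg-1.0-cp312-cp312-macosx_10_9_x86_64.whl",
    "example_pkg-1.0-cp312-cp312-macosx_11_0_arm64.whl",
    "example_pkg-1.0-cp312-cp312-manylinux_2_17_x86_64.whl",
]
for name in wheels:
    print(name, "->", macos_arch(wheel_platform_tag(name)))
```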

henryiii avatar Jun 26 '24 17:06 henryiii

@henryiii - I think I'm still missing something. Could you please have a look? Thanks!

ianna avatar Jul 01 '24 13:07 ianna

@henryiii - I think I'm still missing something. Could you please have a look? Thanks!

ianna avatar Jul 03 '24 15:07 ianna

Test tests/test_1345_avro_reader.py failing on macOS 14 (arm64)

To follow up, @ianna and I worked on reproducing the issue on macOS independently of the GitHub action, and we were able to reproduce the segfault.

System details:

$ sw_vers 
ProductName:		macOS
ProductVersion:		14.5
BuildVersion:		23F79

$ uname -a
Darwin [...] 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103 arm64

After building in a Python 3.10 environment following the instructions^1, we can confirm that the test test_1345_avro_reader is failing.

$ python -m pytest tests/test_1320_mask_identity_defaults.py
[...]
collected 1 item                                                                                                                                                                                          

tests/test_1320_mask_identity_defaults.py .                                                                                                                                                         [100%]
$ python -m pytest tests/test_1345_avro_reader.py            
=========================================================================================== test session starts ===========================================================================================
platform darwin -- Python 3.10.11, pytest-8.2.2, pluggy-1.5.0
rootdir: /path/to/awkward
configfile: pyproject.toml
collected 20 items                                                                                                                                                                                        

tests/test_1345_avro_reader.py Fatal Python error: Segmentation fault

Current thread 0x00000001f755cc00 (most recent call first):
  File "/path/to/awkward/src/awkward/_connect/avro.py", line 81 in __init__
  File "/path/to/awkward/src/awkward/operations/ak_from_avro_file.py", line 48 in from_avro_file
  File "/path/to/awkward/src/awkward/_dispatch.py", line 39 in dispatch
  File "/path/to/awkward/tests/test_1345_avro_reader.py", line 19 in test_int
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/python.py", line 162 in pytest_pyfunc_call
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/python.py", line 1632 in runtest
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 173 in pytest_runtest_call
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 241 in <lambda>
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 341 in from_call
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 240 in call_and_report
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 135 in runtestprotocol
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/runner.py", line 116 in pytest_runtest_protocol
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/main.py", line 364 in pytest_runtestloop
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/main.py", line 339 in _main
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/main.py", line 285 in wrap_session
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/main.py", line 332 in pytest_cmdline_main
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_callers.py", line 103 in _multicall
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pluggy/_hooks.py", line 513 in __call__
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/config/__init__.py", line 178 in main
  File "/path/to/awkward-venv/lib/python3.10/site-packages/_pytest/config/__init__.py", line 206 in console_main
  File "/path/to/awkward-venv/lib/python3.10/site-packages/pytest/__main__.py", line 7 in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86 in _run_code
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196 in _run_module_as_main

Extension modules: numpy._core._multiarray_umath, numpy._core._multiarray_tests, numpy.linalg._umath_linalg (total: 3)
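
Because a segfault kills the whole pytest process, it can help to run the suspect test file in a child process and inspect the exit status. This is a sketch under that assumption, using only the standard library. The helper names are hypothetical, and the negative-return-code convention applies to POSIX systems such as macOS and Linux.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Explain a subprocess return code; on POSIX, negative means killed by a signal."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return "passed" if returncode == 0 else f"exited with code {returncode}"


def run_isolated(test_path: str) -> str:
    """Run one test file in a child interpreter so a crash cannot take down the caller."""
    # -X faulthandler prints a Python-level traceback on fatal signals.
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-m", "pytest", "-q", test_path]
    )
    return describe_exit(proc.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python this_script.py tests/test_1345_avro_reader.py
    print(run_isolated(sys.argv[1]))
```

Run from the root of an awkward checkout, this would report "killed by SIGSEGV" for a crash like the one above instead of ending the session.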

jcfr avatar Jul 13 '24 22:07 jcfr

I haven't been able to reproduce this. I tried on both Intel and Apple Silicon (AS). My Command Line Tools (CLT) were old (version 11!), and I was finally able to get the system to offer an update by touching /tmp/.com.apple.dt.CommandLineTools.installondemand.in-progress. But even with CLT 15, I still don't get a segfault here. Running:

rm -r awkward-cpp/build
git submodule update --init --recursive
nox -s prepare
uv venv
. .venv/bin/activate.fish
uv pip install pip pytest
pip install -v ./awkward-cpp
pip install -e .
pytest tests/test_1345_avro_reader.py

System details (Apple Silicon, arm64):

$ sw_vers
ProductName:		macOS
ProductVersion:		14.5
BuildVersion:		23F79
$ uname -a
Darwin [...] 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:16:51 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T8103 arm64
$ clang --version
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: arm64-apple-darwin23.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

System details (Intel, x86_64):

$ sw_vers
ProductName:		macOS
ProductVersion:		14.4.1
BuildVersion:		23E224
$ uname -a
Darwin [...] Darwin Kernel Version 23.4.0: Fri Mar 15 00:11:05 PDT 2024; root:xnu-10063.101.17~1/RELEASE_X86_64 x86_64
$ clang --version
Apple clang version 15.0.0 (clang-1500.3.9.4)
Target: x86_64-apple-darwin23.4.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

henryiii avatar Jul 18 '24 16:07 henryiii

Thanks to @henryiii who fixed the build on MacOS-13.

ianna avatar Jul 19 '24 15:07 ianna

For future reference, was the issue effectively fixed by the following commit and pull request?

  • https://github.com/scikit-hep/awkward/commit/0eff78cfc3f0c2f789d556029f12ed669c16d880 / https://github.com/scikit-hep/awkward/pull/3167

You may also want to update the issue linked above (https://github.com/scikit-hep/awkward/issues/3181) to indicate that it was fixed by that same pull request.

jcfr avatar Jul 19 '24 17:07 jcfr