yt
OSX wheels aren't compiled with OpenMP support
Bug report
Bug summary
OSX wheels aren't compiled with OpenMP support (as can be read from the logs of the wheel builds). The default clang compiler on OSX doesn't seem to support OpenMP compilation as far as I can tell (but I do not have an OSX machine to confirm or refute this), so extending the documentation to explain how to compile with OpenMP support would be useful. In any case, I've had users report that installing from source does not compile with OpenMP support.
Code for reproduction
- Operating System: OSX on x64 and arm
- Python Version: any
- yt version: any
- Other Libraries (if applicable): OpenMP
Ugh, so that's why that error looked familiar! I'm using an M1 mac, but somehow yt is working. I'm not specifically setting anything to use OpenMP there though, since I'm only running tiny test cases.
I can confirm that Mac's default compiler, clang, does not support OpenMP. Worse, gcc on the command line is aliased to clang, at least by default. (I have not tried turning that off, so I don't know if it's possible.) I have previously been able to compile C code with OpenMP on my mac, but that was by (1) installing gcc, e.g., from homebrew, and (2) compiling C code by calling the specific gcc version, e.g., gcc-13 instead of gcc. This was, however, pure C code that I then called from python with ctypes, and I think I might actually have had to give up getting that to work on the M1 mac. (The 'clang not supporting OpenMP' issue is not new.)
I'm not sure if there's a way to get python to look up if there's a real gcc version on a system and to use that instead. Overall, I suppose apple had some reason for aliasing gcc to clang, but it's a real pain when it doesn't actually support some of gcc's features.
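On the question of looking up a real gcc from python, here is a minimal sketch of one way it could be done. It assumes that real GCC builds are installed under versioned names such as gcc-13 (as Homebrew does), and that the clang alias identifies itself with the word "clang" in its --version output. The helper names looks_like_clang and find_real_gcc, and the version range searched, are made up for illustration and are not part of yt or setuptools.

```python
import shutil
import subprocess


def looks_like_clang(version_text: str) -> bool:
    """Return True if a `--version` banner appears to come from clang."""
    return "clang" in version_text.lower()


def find_real_gcc(min_version: int = 9, max_version: int = 20):
    """Return the path of the newest versioned gcc-N on PATH that is not clang.

    The version range is an arbitrary assumption; returns None if nothing
    suitable is found.
    """
    for version in range(max_version, min_version - 1, -1):
        path = shutil.which(f"gcc-{version}")
        if path is None:
            continue
        result = subprocess.run(
            [path, "--version"], capture_output=True, text=True, check=False
        )
        if result.returncode == 0 and not looks_like_clang(result.stdout):
            return path
    return None


if __name__ == "__main__":
    print(find_real_gcc())
```

If a path comes back, it could be exported as CC before building, in the same spirit as the export CC=gcc-13 line in the build script further down this thread.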
... so I checked, and I can confirm that although the C code compiles (and I can run a test C program from the command line), I get a similar issue to the one Jack reported if I try to call the .so file from python:
OSError: dlopen(/Users/nastasha/code/proj-an-c/interp2d/interp.so, 0x0006): tried: '/Users/nastasha/code/proj-an-c/interp2d/interp.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64')), '/System/Volumes/Preboot/Cryptexes/OS/Users/nastasha/code/proj-an-c/interp2d/interp.so' (no such file), '/Users/nastasha/code/proj-an-c/interp2d/interp.so' (mach-o file, but is an incompatible architecture (have 'arm64', need 'x86_64'))
The .so file was compiled with gcc-13, specifically Homebrew GCC 13.2.0. By the way, I had to deactivate conda before it would even compile due to some linking library issue, but I did activate conda again to run from python.
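One plausible reading of the "have 'arm64', need 'x86_64'" part of that error is that the python interpreter loading the library is an x86_64 build, while gcc-13 produced an arm64 library. That is an assumption, not something confirmed in this thread. The sketch below shows how the two architectures could be compared from python: it reads the first 8 bytes of a thin 64-bit Mach-O file and compares the CPU type against platform.machine(). The function names are hypothetical, and universal ("fat") binaries are deliberately not handled.

```python
import platform
import struct

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a thin 64-bit Mach-O file
CPU_TYPES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def macho_arch(header: bytes):
    """Return 'x86_64' or 'arm64' for a thin 64-bit Mach-O header, else None."""
    if len(header) < 8:
        return None
    # Both architectures are little-endian: magic first, then the CPU type.
    magic, cputype = struct.unpack("<II", header[:8])
    if magic != MH_MAGIC_64:
        return None
    return CPU_TYPES.get(cputype)


def compare_with_interpreter(path: str):
    """Return (library architecture, architecture reported by this python)."""
    with open(path, "rb") as f:
        lib_arch = macho_arch(f.read(8))
    return lib_arch, platform.machine()
```

If the two values differ, dlopen would be expected to refuse the library, whichever compiler produced it.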
Overall, this issue seems to be a general headache on Macs. This Stack Overflow thread seems to have some ideas: https://stackoverflow.com/questions/28010801/compiling-parallelized-cython-with-clang , but it requires installing libraries on your own computer first, and adding a bunch of stuff on the command line. I don't get the impression that there's a straightforward way to get this to work 'off the shelf'.
Honestly, my own 'solution' has been to do small bits of C-from-python development on the login nodes of the university linux cluster. I run all my production analysis on linux-managed clusters anyway.
I'm using an M1 mac, but somehow yt is working. I'm not specifically setting anything to use OpenMP there though, since I'm only running tiny test cases.
Building for mac arm64 without OpenMP is definitely something we've been exercising (and we've been publishing wheels for it since yt 4.0.4), so as long as you're not trying to enable it, yt is expected to build and run correctly on this arch (albeit at sub-optimal performance).
Ugh, yeah this mostly makes me wish I trusted myself to maintain a linux system
I dug a little bit (with @cphyc's help) and found a reasonably painless way to build yt with OpenMP on this platform:
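# test_omp.sh
set -euxo pipefail

# Install a real GCC and point the build at it (Apple's gcc is an alias for clang).
# Homebrew installs versioned binaries; adjust the -13 suffix to the installed version.
brew install gcc
export CXX=g++-13
export CC=gcc-13

# Start from a fresh virtual environment.
rm -fr .venv || true
python -m venv .venv
source .venv/bin/activate
python -m pip install build

# Build and install ewah_bool_utils from source with the compiler selected above.
git clone https://github.com/yt-project/ewah_bool_utils.git _ewah_bool_utils
pushd _ewah_bool_utils
rm -fr dist || true
python -m build --wheel
python -m pip install dist/*.whl
popd

# Build and install yt from source the same way.
git clone https://github.com/yt-project/yt.git _yt
pushd _yt
rm -fr dist || true
python -m build --wheel
python -m pip install dist/*.whl
popd

# Extra packages needed by the benchmark, then time it with two thread counts.
python -m pip install pandas h5py pooch
OMP_NUM_THREADS=4 python t.py
OMP_NUM_THREADS=2 python t.py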
# t.py
import yt
from time import monotonic_ns
import os
from tqdm import tqdm

ds = yt.load_sample("output_00080")
NREP = 10

# Time NREP off-axis projections.
tstart = monotonic_ns()
for i in tqdm(range(NREP)):
    p = yt.ProjectionPlot(ds, [1, 1, 1], ("gas", "density"))
    p.render()
tstop = monotonic_ns()

dt = (tstop - tstart) / 1e9  # in s
print(f"Took {dt:.1e} s ({dt/NREP:.1e} s/it)", end="")
if (omp_num_threads := os.environ.get("OMP_NUM_THREADS")) is not None:
    print(f" using {omp_num_threads} OpenMP threads")
else:
    print()
However, because this technique involves dynamically linking the libgomp I got from homebrew (/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib), the resulting wheel isn't portable, so we cannot apply this to the publishing process.
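To see which OpenMP runtime a compiled extension depends on, one option on macOS is to run otool -L on the .so file and look for libgomp (GCC's runtime) or libomp (LLVM's) among the linked libraries. The sketch below only parses that output. The helper names are hypothetical, and the sample text is illustrative rather than captured from a real build: the libgomp path is the one quoted above, while the file name and version numbers are made up.

```python
def linked_libraries(otool_output: str) -> list[str]:
    """Extract library paths from `otool -L` output (first line is the file name)."""
    libs = []
    for line in otool_output.splitlines()[1:]:
        line = line.strip()
        if line:
            # Each entry looks like "<path> (compatibility version ..., current version ...)".
            libs.append(line.split(" (", 1)[0])
    return libs


def openmp_runtimes(otool_output: str) -> list[str]:
    """Return the linked libraries that look like an OpenMP runtime."""
    return [
        lib
        for lib in linked_libraries(otool_output)
        if "libgomp" in lib or "libomp" in lib
    ]


# Illustrative sample only: file name and version numbers are invented.
SAMPLE = (
    "some_extension.so:\n"
    "\t/opt/homebrew/opt/gcc/lib/gcc/current/libgomp.1.dylib"
    " (compatibility version 2.0.0, current version 2.0.0)\n"
    "\t/usr/lib/libSystem.B.dylib"
    " (compatibility version 1.0.0, current version 1336.0.0)\n"
)

if __name__ == "__main__":
    print(openmp_runtimes(SAMPLE))
```

An absolute path into /opt/homebrew in that list is what makes the wheel non-portable: a machine without the same Homebrew GCC installed would not have the library at that location.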
As noted by @cphyc, portability may be addressable on the conda-forge side (if we're not doing it already).
Meanwhile, we could document this technique, but we need to know whether this is also an issue with conda-forge binaries first.