meson
Improve MPI detection
Fixes #7045, #9637 and #13615. See also the old PR #7373.
This PR improves support for MPI detection on Unix, which is currently quite broken. I didn't try to study what happens on Windows.
- Currently, Meson only supports IntelMPI with Intel compilers and OpenMPI with other compilers. MPICH is not supported at all (see #7045 and #13615).
- Currently, pkg-config gets priority over detecting mpicc/mpiicc (and friends), which can lead to very unexpected results: only OpenMPI supports pkg-config, and all implementations (at least on Unix) state that the correct way to detect/use MPI is to use mpicc/mpiicc (and friends). As a result, if the system pkg-config gives a positive result, Meson detects OpenMPI even if PATH has been correctly modified such that mpicc points to something else.
- Environment variables like I_MPI_CC are wrongly considered, whereas they are internal to IntelMPI (see #9637).
This PR improves the situation. The detection considers (in this order):
- The standard environment variables MPICC (and friends)
- For Intel compilers, the commands mpiicc (and friends)
- The commands mpicc (and friends)
- Finally, pkg-config.
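The lookup order above can be sketched as follows. This is only an illustration of the ordering, not the code in this PR: the WRAPPERS table and the candidate_order function are hypothetical, and the Fortran environment variable name (MPIFC) is an assumption, since the description only names MPICC "and friends".

```python
# Hypothetical table, for illustration only (not Meson's internal data).
# "MPICC", "mpiicc", "mpicc", "mpiifort", "mpifort", "mpif90" and "mpif77"
# appear in this thread; "MPIFC" is an assumed name.
WRAPPERS = {
    "c": {"env": "MPICC", "intel": ["mpiicc"], "generic": ["mpicc"]},
    "fortran": {
        "env": "MPIFC",
        "intel": ["mpiifort"],
        "generic": ["mpifort", "mpif90", "mpif77"],
    },
}


def candidate_order(language, env, intel_compiler):
    """Return (method, value) pairs in the order described above."""
    spec = WRAPPERS[language]
    candidates = []
    # 1. The standard environment variable, if the user set it.
    value = env.get(spec["env"])
    if value:
        candidates.append(("env", value))
    # 2. Intel-specific wrappers, only when an Intel compiler is in use.
    if intel_compiler:
        candidates += [("wrapper", name) for name in spec["intel"]]
    # 3. The generic wrappers.
    candidates += [("wrapper", name) for name in spec["generic"]]
    # 4. pkg-config as the last resort.
    candidates.append(("pkg-config", None))
    return candidates


if __name__ == "__main__":
    for step in candidate_order("fortran", {}, intel_compiler=False):
        print(step)
```

With an empty environment and a non-Intel compiler, the Fortran candidates are mpifort, mpif90 and mpif77, and pkg-config comes last.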
OpenMPI, MPICH, IntelMPI and other compatible implementations should be supported with different compilers. In particular OpenMPI compiled with Intel compilers and IntelMPI compiled with GCC.
MPI can be a bit tricky and there are a lot of cases to consider. It would be interesting to get the point of view of people using MPI on different clusters. CC @RemiLacroix-IDRIS, @scivision, @rcoacci, @nordmoen, @acroucher.
Note : The output of mpicc -v is not standardized. I tried to support what I obtain from different implementations but there might be other cases. In particular, I removed a line with v = re.search(r'(\d{4}) Update (\d)', out).
Note : The output of mpicc -v is not standardized. I tried to support what I obtain from different implementations but there might be other cases. In particular, I removed a line with v = re.search(r'(\d{4}) Update (\d)', out).
I think that's needed for older version of Intel MPI, for example here is the output I get:
$ mpiifort -v
mpiifort for the Intel(R) MPI Library 2019 Update 9 for Linux*
Copyright 2003-2020, Intel Corporation.
ifort version 19.1.3.304
I think that's needed for older version of Intel MPI.
Thanks. This output should be supported now.
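To make the discussion concrete, here is a minimal sketch of how the first line of the wrapper's -v output could be turned into a version string. The function name parse_wrapper_version and the "2019.9" output format are assumptions made for this example; the Intel pattern is adapted from the regular expression quoted above and the sample lines are taken from this thread.

```python
import re


def parse_wrapper_version(out):
    """Best-effort version extraction from `<wrapper> -v` output.

    Hypothetical helper: returns None when the format is not recognized,
    because the output of the wrappers is not standardized.
    """
    first = out.splitlines()[0] if out else ""
    # Older Intel MPI: "... Intel(R) MPI Library 2019 Update 9 for Linux*"
    match = re.search(r"(\d{4}) Update (\d+)", first)
    if match:
        return f"{match.group(1)}.{match.group(2)}"
    # MPICH: "mpifort for MPICH version 4.0"
    match = re.search(r"MPICH version (\d+(?:\.\d+)*)", first)
    if match:
        return match.group(1)
    return None


if __name__ == "__main__":
    intel = "mpiifort for the Intel(R) MPI Library 2019 Update 9 for Linux*"
    print(parse_wrapper_version(intel))
    print(parse_wrapper_version("mpifort for MPICH version 4.0"))
```

Returning None for unknown formats, rather than raising, lets a caller fall back to another detection method.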
Thanks for this! I just installed your branch in a venv and tried building a minimal Fortran program that links to MPI, on an Ubuntu 22.04 machine with mpich installed. Unfortunately it didn't seem to detect MPI - not sure how much testing you've done with Fortran? (It's also possible I'm doing something in an outdated way, as I've been stuck with using Meson 0.53 since mpich detection was broken.)
Here is the program and the meson build file:
program foo
use mpi
implicit none
write(*,*) 'foo!'
end program foo
project('foo', 'fortran', version: '0.0.1')
mpi = dependency('mpi', language: 'fortran')
foo = executable('foo', ['foo.F90'], dependencies: [mpi])
and here is the relevant part of the meson-log.txt:
Found pkg-config: YES (/usr/bin/pkg-config) 0.29.2
Determining dependency 'ompi-fort' with pkg-config executable '/usr/bin/pkg-config'
env[PKG_CONFIG_PATH]: /home/acro018/lib/pkgconfig
env[PKG_CONFIG]: /usr/bin/pkg-config
-----------
Called: `/usr/bin/pkg-config --modversion ompi-fort` -> 1
stderr:
Package ompi-fort was not found in the pkg-config search path.
Perhaps you should add the directory containing `ompi-fort.pc'
to the PKG_CONFIG_PATH environment variable
No package 'ompi-fort' found
-----------
mpifort binary missing from cross or native file, or env var undefined.
Trying a default mpifort fallback at mpifort
Trying a default mpifort fallback at mpif90
Trying a default mpifort fallback at mpif77
mpifort found: NO
Run-time dependency MPI for fortran found: NO (tried pkgconfig, config-tool and system)
meson.build:3:6: ERROR: Dependency "mpi" not found, tried pkgconfig, config-tool and system
Here is the output from mpicc --version:
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
and mpifort --version:
GNU Fortran (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Thanks a lot for the testing @acroucher. I didn't try Fortran so it's not too surprising that it fails. I should be able to try this and hopefully fix this issue.
@acroucher can you recheck that you really use the branch paugier:mpi-detection ? From your log, I see that pkg-config is used first, which shouldn't be the case with my branch.
I tried your small example in a conda environment created with conda create -n env-mpich-fortran mpich fortran-compiler and MPI is correctly detected here.
mpifort found: YES (/data/mambaforge/envs/env-mpich-fortran/bin/mpifort) 4.2.2
Run-time dependency MPI for fortran found: YES 4.2.2
Can you please also provide the output of
mpifort -show
# and
mpifort -v # (only first line)
can you recheck that you really use the branch paugier:mpi-detection ?
Doh! you're right, I forgot to check out the branch. I ran it again on the correct branch and it works fine. Brilliant! I also tested it on some real code and that worked too. Thank you!
Can you please also provide the output of
And here are the outputs from mpifort -show and mpifort -v:
mpifort -show
gfortran -O2 -ffile-prefix-map=/build/mpich-0xgrG5/mpich-4.0=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -fallow-invalid-boz -fallow-argument-mismatch -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -I/usr/include/x86_64-linux-gnu/mpich -I/usr/include/x86_64-linux-gnu/mpich -L/usr/lib/x86_64-linux-gnu -lmpichfort -lmpich
mpifort -v
mpifort for MPICH version 4.0
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.4.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-XeT9lY/gcc-11-11.4.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)
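As an illustration of why the mpifort -show output above is useful, the sketch below splits such a line into include flags and link flags. The helper split_show_output is hypothetical and deliberately simple: it assumes the first token is the underlying compiler, only handles the joined forms (-Idir, -Ldir, -lname), and ignores every other flag.

```python
import shlex


def split_show_output(show_output):
    """Split `<wrapper> -show` output into (compile_args, link_args).

    Illustrative sketch only; real wrapper output can contain forms
    that are not handled here (for example "-I dir" as two tokens).
    """
    tokens = shlex.split(show_output)[1:]  # drop the underlying compiler name
    compile_args = []
    link_args = []
    for token in tokens:
        if token.startswith("-I"):
            # The MPICH output above repeats the same include directory.
            if token not in compile_args:
                compile_args.append(token)
        elif token.startswith(("-L", "-l")):
            link_args.append(token)
    return compile_args, link_args


if __name__ == "__main__":
    sample = (
        "gfortran -O2 -I/usr/include/x86_64-linux-gnu/mpich "
        "-I/usr/include/x86_64-linux-gnu/mpich "
        "-L/usr/lib/x86_64-linux-gnu -lmpichfort -lmpich"
    )
    print(split_show_output(sample))
```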
I edited the history as suggested. I hope I did it correctly. I get:
@ changeset: 16979:3dc06bdb03d3
| bookmark: mpi-detection
| tag: default/mpi-detection
| tag: tip
| user: paugier <[email protected]>
| date: Mon Sep 02 17:46:08 2024 +0200
| summary: MPI detection: get version from old IntelMPI wrappers
|
o changeset: 16978:025928ebca77
| parent: 16975:5d46b2165d8f
| user: paugier <[email protected]>
| date: Fri Aug 30 17:29:10 2024 +0200
| summary: MPI detection: support more implementations (any compilers)
|
o changeset: 16975:5d46b2165d8f
| user: paugier <[email protected]>
| date: Wed Sep 04 23:15:56 2024 +0200
| summary: MPI detection: mpicc/mpiicc before pkg-config
|
o changeset: 16974:c34a0170dda1
| bookmark: master
| tag: upstream/master
| user: Dylan Baker <[email protected]>
| date: Mon Aug 19 14:29:44 2024 -0700
| summary: mformat: provide nice error message instead of backtrace for invalid value
which is quite reasonable (except maybe the dates). However, it seems that Github sorts the commits by date, which is a bit wrong here.
which is quite reasonable (except maybe the dates). However, it seems that Github sorts the commits by date, which is a bit wrong here.
That looks like mercurial output, so perhaps it's about the export format. In git on the command-line they are ordered correctly, but I also notice that the mercurial "date" field is mapped to both "AuthorDate" and "CommitDate" in git. Usually when applying/reordering commits in git, the CommitDate is the date that you performed the reordering, and I am pretty sure that's what github uses to order its history flow since it also wants to interweave commits in between the comments surrounding when the commits were made.
Weird UX issue, I guess. :) Works locally though.
Finding MPI is quite a task. For reference:
- CMake FindMPI.cmake seems to just work
- another attempt at FindMPI.cmake
Finding MPI is quite a task.
Indeed! It seems to me that it is reasonable to progress by steps. Especially because it is very difficult (or impossible) to test the different possibilities.
The first step (this PR) is to fix Meson for the very common cases (in particular MPICH, OpenMPI with Intel compiler and IntelMPI with GCC).
Once Meson has good support for the common cases, more projects relying on MPI will be able to use Meson, and we can expect further bug reports that will allow us to support more cases.
A natural next step after this PR would be to look at what happens on Windows.
@eli-schwartz, is there something else I need to do on this PR? It seems to me that it really improves the situation.
@eli-schwartz gentle ping. Could I get more feedback on this PR? Is there an issue? Should I do something more?
Other than the loop thing, this looks super reasonable to me.
I updated the PR. @eli-schwartz and @dcbaker, do I need to do anything more?
Fixes #7045, #9637 and #13615. See also the old PR #7373.
Aside: github unfortunately doesn't actually figure this information out on its own. It requires you to repeat the "fixes" word for each ticket if you want merging this PR to automatically close the relevant issues.
I changed the description of this PR to automatically close the relevant issues.
I rebased to fix a lint issue (https://github.com/mesonbuild/meson/pull/13695).
Thank you for persevering. :)
Thanks @eli-schwartz and @dcbaker for your reviews.
Now I would like to use this on conda-forge, Spack and Guix, but I will have to wait for a new Meson release.
I see that Meson 1.5.2 was released just a few days ago (https://pypi.org/project/meson/#history). Do you think we could have a new release (1.5.3?) in a few days or weeks?
I realize that pushing a new version on PyPI for Meson might not be as straightforward as for simpler and less popular projects. So I'm just respectfully asking...
Unfortunately I don't think we can backport it at all. It changes the "shape" of the pickled coredata which means that existing build directories configured with meson 1.5.2 would be binary incompatible with meson 1.5.3, leading to crashes. A new major.minor release forces a full from-scratch reconfigure instead and doesn't load e.g. things like cached dependency lookups.
It's a good question though, in general, because I'm always happy to consider nominating a patch for backporting, and I'd like to get us into the situation where we issue new bugfix releases once every couple of weeks in the event that there are patches that have been nominated.
Oh, that's bad news. If I understand correctly, it means that these fixes will only be included in Meson 1.6.0, which could be a few months away if I estimate from the release history.
Without a release on PyPI containing these fixes, it seems to me that it's impossible to test with conda-forge and Spack.
In practice, it means that Python projects using Meson and MPI cannot be installed on clusters using MPICH with standard install procedures (one would need to install meson from source by hand, plus all build dependencies, and run pip install --no-build-isolation ...).
The next question is then: do you have an idea when Meson 1.6.0 could be released? If it is a matter of approximately one month, I would wait. However, if it could be a few months, I would have to switch a few projects back to another backend (I guess setuptools). Note that it would not be a huge issue, just some work that is neither very useful nor very interesting.
Considering our usual release cadence, meson 1.6.0rc1 should probably be released within the month.
It's also possible to test with conda-forge and spack, if you make a VCS package available for meson. For example, in Gentoo these are called "live" ebuilds, you install "meson==9999" and it always fetches the latest code from git master. In Arch, the same thing is called "meson-git==1.5.0.r214.g1aac6cc1e"
You can also specify a dependency in pyproject.toml as:
meson @ git+https://github.com/mesonbuild/meson
meson @ git+https://github.com/mesonbuild/meson@sha1
if you want to guarantee that pip install and build isolation pulls in a version of meson that you know has all the features you want. pip install will anyways not preserve build directories by default, so no incremental builds of an existing worktree and therefore no worry about binary coredata compatibility.
But we really should be having a new release (candidate) within the month. The previous release was July 10, it's been 2.5 months since then, and we try to have a new release once every 3 or 4 months, which means we should have one anywhere from 2 weeks to 1.5 months from now -- and if we assume the outer limit of 1.5 months, we still need to put out release candidates at least 3 weeks ahead of time.
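The arithmetic in the previous paragraph can be checked with a few lines of Python. The year 2024 is taken from the commit dates earlier in this thread, and the value of today is an assumption derived from "2.5 months" after July 10; add_months is a small hypothetical helper, not a standard library function.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months (only valid for day-of-month <= 28)."""
    total = d.month - 1 + months
    return d.replace(year=d.year + total // 12, month=total % 12 + 1)


previous_release = date(2024, 7, 10)
today = date(2024, 9, 25)  # assumption: about 2.5 months after July 10

earliest = add_months(previous_release, 3)  # "once every 3 ... months"
latest = add_months(previous_release, 4)    # "... or 4 months"

days_to_earliest = (earliest - today).days  # roughly two weeks
days_to_latest = (latest - today).days      # roughly one and a half months

print(earliest, days_to_earliest)
print(latest, days_to_latest)
```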
@paugier,
the plan is to tag the first release candidate next Sunday.
@paugier,
There is an rc1 released today and available for installation via PyPI. I packaged it in Gentoo for the benefit of people doing prerelease testing, but I don't know the policies of conda or spack around that sort of thing.
Please test this prerelease if you can, to help ensure the release occurs as smoothly as possible.
We hope to release the final 1.6.0 release in one week's time. Alas, life finds a way, and on average it's common to find at least one regression serious enough to make a second release candidate to allow people to test that fix; if this should happen, we expect the final release to happen in two weeks' time instead.
@eli-schwartz
I'm finally trying 1.6.0.rc1 and ~~it seems that there is a problem~~. I'm trying to understand.
It was my mistake. Just for the record, a "simple" way to check:
conda create -n env-mpich mpich cxx-compiler python=3.12 -y
conda activate env-mpich
conda install mpi4py fftw=*=mpi* fluidfft pkg-config meson-python cython fluidfft-builder -y
pip install meson --pre -U
pip install fluidfft-fftwmpi --no-build-isolation -v
which gives:
mpic++ found: YES (/home/pierre/mambaforge/envs/env-mpich/bin/mpic++) 4.2.3
mpich can be replaced by openmpi or impi-devel.
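For anyone repeating this check across several environments, the "found: YES/NO" lines can be pulled out of the build output or meson-log.txt with a short script. This is a rough sketch: the regular expression is inferred only from the log lines quoted in this thread, and wrapper_results is a hypothetical helper, not part of Meson.

```python
import re

# Matches lines such as:
#   mpic++ found: YES (/path/to/mpic++) 4.2.3
#   mpifort found: NO
# Inferred from the log excerpts in this thread; other formats may exist.
FOUND_RE = re.compile(r"^(\S+) found: (YES|NO)(?: \((.*?)\) (\S+))?$")


def wrapper_results(log_text):
    """Return (name, found, path, version) for each wrapper lookup line."""
    results = []
    for line in log_text.splitlines():
        match = FOUND_RE.match(line.strip())
        if match:
            name, status, path, version = match.groups()
            results.append((name, status == "YES", path, version))
    return results


if __name__ == "__main__":
    log = (
        "mpifort found: NO\n"
        "mpic++ found: YES (/home/pierre/mambaforge/envs/env-mpich/bin/mpic++) 4.2.3\n"
    )
    for entry in wrapper_results(log):
        print(entry)
```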