Apple linker does not accept `-commons use_dylibs` flag anymore
Background information
What version of Open MPI are you using? 5.0.2
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Installed from released sources as part of Homebrew build (https://github.com/Homebrew/homebrew-core/pull/166807)
Please describe the system on which you are running
- Operating system/version: macOS 14.4
- Computer hardware: Apple M1
- Network type: not relevant
Details of the problem
Compiling any Fortran MPI code with `mpifort hellof.f90 -o hellof` with Xcode 15.3 gives:
ld: warning: -commons use_dylibs is no longer supported, using error treatment instead
ld: common symbol '_mpi_fortran_argv_null_' from '/private/tmp/cclH6ubZ.o' conflicts with definition from dylib '_mpi_fortran_argv_null_' from '/opt/homebrew/Cellar/open-mpi/5.0.2_1/lib/libmpi_usempi_ignore_tkr.40.dylib'
collect2: error: ld returned 1 exit status
That is because -commons use_dylibs is now ignored (giving the warning), which leads to the symbol being rejected as defined twice.
I have reported the regression (compared to Xcode 14 and earlier linkers) to Apple as FB13194355.
I encountered a similar problem when trying to install the superlu-dist package using spack with [email protected]:
ld: warning: ignoring duplicate libraries: '-lemutls_w', '-lgcc', '-lgfortran', '-lmpi', '-lmpi_mpifh', '-lmpi_usempi_ignore_tkr', '-lmpi_usempif08', '-lquadmath'
ld: warning: -commons use_dylibs is no longer supported, using error treatment instead
ld: warning: ignoring duplicate libraries: '-lemutls_w', '-lgcc', '-lgfortran', '-lmpi', '-lmpi_mpifh', '-lmpi_usempi_ignore_tkr', '-lmpi_usempif08', '-lquadmath'
ld: common symbol '_mpi_fortran_argv_null_' from '/private/var/folders/pd/9hc154y94k9_t_rb4lw0vcw00000gn/T/neoh/spack-stage/spack-stage-superlu-dist-8.2.1-h3rdwb66k3wb4s6gjglymknnv4xor3nf/spack-build-h3rdwb6/FORTRAN/CMakeFiles/f_pddrive.dir/f_pddrive.F90.o' conflicts with definition from dylib '_mpi_fortran_argv_null_' from '/Users/neoh/spack/opt/spack/darwin-sonoma-m2/apple-clang-15.0.0/openmpi-5.0.2-ja66vwemf6adckixrk6njhmaglqwck6v/lib/libmpi_usempi_ignore_tkr.40.dylib'
Was this warning treated differently in earlier versions of apple-clang?
It might be worth trying LDFLAGS=-ld_classic
Not sure if this is related to this issue, though.
If this fixes the issue, all the credit should go to @jeffhammond https://twitter.com/science_dot/status/1772314603692626154
Is Open MPI getting these flags from GNU Libtool? I.e., is this actually a Libtool issue?
No x 2:
https://github.com/open-mpi/ompi/blob/984944d9d9f3f6eda199fe6a040d65070d3a0745/config/ompi_setup_fc.m4#L236
FYI, I have verified that the following works with Xcode 15.3 on Sonoma 14.4; it is the workaround Apple gave me.
I also confirmed it works when gfortran is used to initiate the linker, if -Wl,-ld_classic -Wl,-commons,use_dylibs is used.
% gcc -fPIC -shared extern2.c -o libxxx.so && \
gfortran -c extern.F90 && ld extern.o libxxx.so \
-L/opt/homebrew/Cellar/gcc/13.2.0/lib/gcc/current/ -lgfortran \
-o extern -ld_classic -commons use_dylibs && \
./extern ; nm extern | grep MPI
ld: warning: -commons use_dylibs is no longer supported, using error treatment instead
MPIR_F08_MPI_IN_PLACE=0 &MPIR_F08_MPI_IN_PLACE=0x102818000 &MPIR_F08_MPI_IN_PLACE=4337008640
LOC(MPI_IN_PLACE)= 4337008640
LOC(buf)= 6134510240
sendbuf=0x102818000, sendbuf=4337008640
sendbuf is MPI_IN_PLACE? yes
recvbuf=0x16da532a0, recvbuf=6134510240
*count=1, *datatype=2, *op=3, *comm=4
911
U _MPIR_F08_MPI_IN_PLACE
U _MPI_Allreduce
// extern2.c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Sentinel whose address is compared against the send buffer.
int MPIR_F08_MPI_IN_PLACE;

void p(void)
{
    printf("MPIR_F08_MPI_IN_PLACE=%d &MPIR_F08_MPI_IN_PLACE=%p &MPIR_F08_MPI_IN_PLACE=%" PRIuPTR "\n",
           MPIR_F08_MPI_IN_PLACE, (void *)&MPIR_F08_MPI_IN_PLACE, (uintptr_t)&MPIR_F08_MPI_IN_PLACE);
}

// Stand-in for the real routine: reports whether sendbuf is the sentinel.
void MPI_Allreduce(void ** sendbuf, void ** recvbuf,
                   int * count, int * datatype,
                   int * op, int * comm, int * ierror)
{
    printf("sendbuf=%p, sendbuf=%" PRIuPTR "\n", (void *)sendbuf, (uintptr_t)sendbuf);
    printf("sendbuf is MPI_IN_PLACE? %s\n",
           (uintptr_t)sendbuf == (uintptr_t)&MPIR_F08_MPI_IN_PLACE ? "yes" : "no");
    printf("recvbuf=%p, recvbuf=%" PRIuPTR "\n", (void *)recvbuf, (uintptr_t)recvbuf);
    printf("*count=%d, *datatype=%d, *op=%d, *comm=%d\n",
           *count, *datatype, *op, *comm);
    *ierror = 911;
}
! extern.F90
module mpi
    use iso_c_binding
    !type(c_ptr), bind(C,name="MPI_F_IN_PLACE") :: MPI_IN_PLACE
    integer(c_int), bind(C, name="MPIR_F08_MPI_IN_PLACE"), target :: MPI_IN_PLACE
    interface
        subroutine p() bind(C,name="p")
        end subroutine
    end interface
    interface
        SUBROUTINE MPI_ALLREDUCE(SENDBUF, RECVBUF, COUNT, DATATYPE, OP, COMM, IERROR) &
            bind(C,name="MPI_Allreduce")
            use iso_c_binding
            import :: MPI_IN_PLACE
            !DEC$ ATTRIBUTES NO_ARG_CHECK :: sendbuf,recvbuf
            !GCC$ ATTRIBUTES NO_ARG_CHECK :: sendbuf,recvbuf
            !$PRAGMA IGNORE_TKR sendbuf,recvbuf
            !DIR$ IGNORE_TKR sendbuf,recvbuf
            !IBM* IGNORE_TKR sendbuf,recvbuf
            INTEGER(kind=c_int) :: SENDBUF(*), RECVBUF(*)
            INTEGER(kind=c_int) :: COUNT, DATATYPE, OP, COMM, IERROR
        END SUBROUTINE MPI_ALLREDUCE
    end interface
end module mpi

program main
    use mpi
    implicit none
    real :: buf(100)
    integer :: ierror
    call p
    buf = 17
    print*,'LOC(MPI_IN_PLACE)=',LOC(MPI_IN_PLACE)
    print*,'LOC(buf)=',LOC(buf)
    call MPI_ALLREDUCE(MPI_IN_PLACE,buf,1,2,3,4,ierror)
    print*,ierror
end program main
I am trying to do the same for Homebrew: https://github.com/Homebrew/homebrew-core/pull/166807
My original analysis was that -ld_classic was not effective anymore, because of the weird warning. But in spite of the warning, the classic linker can still be called that way.
This worked for me:
spack install superlu-dist ldflags=-ld_classic
Thanks @jeffhammond @ggouaillardet. My specs are Sonoma 14.2.1 and [email protected] (Xcode 15.3).
Oddly enough this worked for me:
brew install gcc-13
../configure --prefix=/opt/extlib/openmpi/5.0.2/gcc/13.2.0 \
--with-libevent=internal \
--enable-mpi1-compatibility \
--enable-static \
--enable-pmix-timing \
CC=gcc-13 CXX=g++-13 FC=gfortran-13
make clean
make -j 8
make check
sudo make install
I was unable to install open-mpi 5.0.3 with the same method.
I'm sorry for the huge delay here. Thanks for the citation of ompi_setup_fc.m4.
I have Sonoma 14.4.1 with Xcode 15.3 and Homebrew gfortran:
$ gfortran --version
GNU Fortran (Homebrew GCC 13.2.0) 13.2.0
But I don't see these warnings when I compile with the homebrew gfortran.
$ mpifort --showme
gfortran -I/Users/jsquyres/bogus/include -Wl,-flat_namespace -Wl,-commons,use_dylibs -I/Users/jsquyres/bogus/lib -L/Users/jsquyres/bogus/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi
$ mpifort hello_usempif08.f90 -o hello
$
What is different between my setup and yours?
We have re-enabled the "classic linker" in Homebrew gfortran at some point.
Ah, gotcha. Is this issue moot, then? Or do we still need to investigate the use of -commons use_dylibs?
I see some comments in our code that these flags were necessary at some point, but I'm afraid I don't know/remember why they were necessary (i.e., to know if they are still necessary).
They are still necessary. They are incompatible with Apple's new linker, hence we (for now) rely on the old linker.
Ok. Given that homebrew gfortran has updated, should we close this issue?
Well, it's a workaround, not a proper fix: at some point the "classic linker" might not be supported by Apple anymore. Maybe an alternative implementation is possible?
Let me make sure I'm parsing your reply correctly:
- You are saying that `-commons use_dylibs` is still necessary.
- As such, Open MPI still needs to use these flags, and gfortran still needs to support them.
- However, this solution uses Apple's old/classic linker, which could disappear someday. Hence, a different solution should be found.
Is that correct?
If so, can you explain / remind me why we need `-commons use_dylibs` / what those flags do?
All of that is correct.
The Fortran part of open-mpi uses them for common blocks. I haven't dug more on how and why.
Ah, yes, we do use some common blocks for sentinel values (i.e., they really have to be global so that we can look for them by address, not by value):
https://github.com/open-mpi/ompi/blob/ce3742c97821ee30ff5cbefda192f3c3754eb353/ompi/include/mpif-sentinels.h#L60-L68
Also, I have tested and even if MPI wasn't using COMMON, the same linker behavior is required for Fortran module data to work properly, so one cannot argue that Apple is trying to force Fortran developers to stop using COMMON (which might be laudable in some contexts).
@fxcoudert MPI implementations have to use COMMON for these. It's necessary because of how the MPI standard defines mpif.h and is furthermore required in the MPI Fortran modules until mpif.h is deleted, because sentinels are required to be interoperable across all MPI Fortran header/module usage.
It might be possible to define MPI_ANY_SOURCE (e.g.) as module data in the MPI modules, but then sentinel detection is two branches instead of one. I have not studied this in every detail to know if it's strictly valid or not, because there are a lot of edge cases to think about (such as Fortran code that uses the COMMON sentinel passing that argument into Fortran code that uses the module interfaces).
Hello
I am running macOS 14.5 on an Apple M1 with Xcode 15.4, gcc-14, g++-14 and gfortran-14.
I compiled open-mpi-5.0.3:
configure --prefix=$APP_DIR/openmpi-5.0.3 FC=gfortran-14 CC=gcc-14 CXX=g++-14 -with-pmix=internal --with-libevent=internal --with-hwloc=internal
make
make install
Then, when I compile a program, I face a similar problem:
ld: warning: -commons use_dylibs is no longer supported, using error treatment instead
ld: common symbol '_mpi_fortran_argv_null_' from '/Users/chris/Builds/gnu14/paradigm/test/CMakeFiles/pdm_t_closest_points_f.dir/pdm_t_closest_points_f.f90.o' conflicts with definition from dylib '_mpi_fortran_argv_null_' from '/Users/chris/Applications/gnu14/openmpi-5.0.0/lib/libmpi_usempif08.40.dylib'
collect2: error: ld returned 1 exit status
For anyone else running into this, I found the following ways to all work around this:
- Compile `openmpi` with `--with-wrapper-fcflags=-Wl,-ld_classic`
- Edit `openmpi-install-prefix/share/openmpi/mpifort-wrapper-data.txt` and add `-Wl,-ld_classic` to the `linker_flags=-L${libdir}` line
- Set `LDFLAGS="-Wl,-ld_classic"`
- If using `spack` to build a library that uses a broken `mpifort`: `spack install my-package ldflags="-Wl,-ld_classic"`
The fix for this has been merged into main, v4.1.x, and v5.0.x. It will be included in the next releases of v4.1 and v5.0.
Thank you!
Great news @jsquyres! Just to be clear, this will be in >=5.0.4? I.e., it isn't being backported?
@Chrismarsh This will be available in 5.0.4
It's going to be in v4.1.7, too. We keep promising to get v4.1.7 out "someday", but there hasn't been an urgent need yet.
It will not be in any v4.0.x release -- that series is dead.