HDF5.jl
Crash with system-provided OpenMPI and HDF5_jll v1.14
When I set up a simple project with the latest MPI and HDF5 packages and configure it to use the system-provided OpenMPI installation, the call to MPI.Init() crashes with “orte_init failed” errors. I am observing the issue on both Ubuntu 18.04 (OpenMPI 3.1.2) and 20.04 (OpenMPI 4.0.3). Downgrading to HDF5_jll v1.12 fixes the issue.
Steps to reproduce:
- create a new folder and launch Julia with
julia --project=.
- install dependencies with
]add MPI HDF5
- run
using MPI; MPI.MPIPreferences.use_system_binary()
- attempt to run
mpirun -n 4 julia --project -e "using MPI, HDF5; MPI.Init()"
(or mpiexecjl), observe the crash
- downgrade with
]add HDF5_jll@1.12
and rerun without a crash
On Ubuntu 18.04, the error includes the following line, in case this is helpful:
mca_base_component_repository_open: unable to open mca_pmix_pmix3x: /home/user/.julia/artifacts/f9744710560ba3ddc00cd9df62ac7dfcd18c8649/lib/openmpi/mca_pmix_pmix3x.so: undefined symbol: opal_envar_t_class
Ah, I've seen something similar! The problem appears to be that we're opening two different MPI libraries: the system one (from MPI.jl) and the JLL one (from HDF5_jll).
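One way to check whether two MPI libraries really end up in the same process is to list the loaded shared libraries after importing both packages. This is only a minimal sketch: it relies on the standard library function Libdl.dllist(), and the name filter on "mpi" is an assumption that may match unrelated libraries or miss some.

```julia
using Libdl
using MPI, HDF5   # importing both packages loads their shared libraries

# Collect every loaded shared library whose file name mentions "mpi".
mpi_libs = filter(p -> occursin(r"mpi"i, basename(p)), Libdl.dllist())

# Print the full paths so system and artifact copies can be told apart.
foreach(println, mpi_libs)
```

If the output shows one libmpi under a system directory and another under ~/.julia/artifacts, both the system OpenMPI and the JLL-provided one have been loaded.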
Easy workarounds:
- use a system HDF5 (see HDF5.jl docs)
- cap HDF5_jll at 1.12 (set the compat entry HDF5_jll = "~1.12").
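The second workaround can also be applied from the Julia REPL instead of editing Project.toml by hand. The sketch below assumes Julia 1.8 or newer (for Pkg.compat) and assumes that HDF5_jll has to be a direct dependency of the project before a compat entry can be set for it.

```julia
using Pkg

# Add HDF5_jll as a direct dependency, restricted to the 1.12 series.
Pkg.add(name="HDF5_jll", version="1.12")

# Record HDF5_jll = "~1.12" in the [compat] section of Project.toml.
Pkg.compat("HDF5_jll", "~1.12")
```

Run this inside the project environment (julia --project=.) so that the cap is written to that project's Project.toml.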
In the longer term we need a better fix. @giordano @eschnett any suggestions on how we can deal with this?
I thought HDF5_jll.jl would use the MPI library chosen by MPIPreferences.jl.
Yeah, I don't quite get why it's pulling in OpenMPI_jll.
Ah, I see.
It augments based on the value of the MPI abi:
https://github.com/JuliaBinaryWrappers/HDF5_jll.jl/blob/b96de8ada558f8d70e27b5561d4f5df815b01ebf/.pkg/platform_augmentation.jl#L13
But the augmentation for abi = "openmpi" always loads OpenMPI_jll:
https://github.com/JuliaBinaryWrappers/HDF5_jll.jl/blob/main/src/wrappers/x86_64-linux-gnu-libgfortran5-cxx03-mpi%2Bopenmpi.jl#L9
My approach, of course, would be to use the Julia-provided MPItrampoline as the MPI implementation, and to use the system MPI via MPItrampoline...
Would it be possible to print a warning if a system-provided MPI installation is detected but no system-provided HDF5 is?
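A rough sketch of what such a check could look like from user code follows. The helper warn_on_mixed_mpi is hypothetical and not part of MPI.jl or HDF5.jl. It also assumes an HDF5.jl version that records a system library path under the preference key "libhdf5", and that MPIPreferences and Preferences are direct dependencies of the project.

```julia
using MPIPreferences   # records which MPI binary MPI.jl is configured to use
using Preferences      # used here to read HDF5.jl's stored preferences
using HDF5

# Hypothetical helper: returns true (and warns) when a system MPI is
# combined with an HDF5 library that still comes from HDF5_jll.
function warn_on_mixed_mpi()
    system_mpi = MPIPreferences.binary == "system"
    # Assumption: a system libhdf5 is recorded under the key "libhdf5".
    system_hdf5 = load_preference(HDF5, "libhdf5", nothing) !== nothing
    mixed = system_mpi && !system_hdf5
    if mixed
        @warn "MPI.jl uses a system MPI, but HDF5 comes from HDF5_jll; two different MPI libraries may be loaded."
    end
    return mixed
end

warn_on_mixed_mpi()
```

The check only inspects stored preferences, so it can run before MPI.Init() is called.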