Workaround Preferences in test environment for 1.6 & 1.7
Fixes #561
@luraess can you confirm that this fixes the issue for you?
Running the tests with export JULIA_MPI_TEST_NPROCS=2 on branch origin/vc/propagate_preferences_to_test still results in the following (erroneous?) behaviour (see #561):
Hello world, I am rank 1 of 2
Hello world, I am rank 0 of 2
Test Summary: | Pass Total
mpiexecjl | 6 6
┌ Info: Running MPI tests
│ ArrayType = Array
│ nprocs = 2
│ MPIPreferences.abi = "MPICH"
└ MPIPreferences.binary = "MPICH_jll"
One thing is solved, however (I don't know which change fixed it): the tests no longer error on test_error.jl as they did previously (not reported).
Also for me running the tests with Julia v1.7.2 still picks up MPICH_jll instead of the system MPI.
What I'm doing right now to run the tests with system MPI is to activate the test environment and run MPIPreferences.use_system_binary(). Then, I replaced https://github.com/JuliaParallel/MPI.jl/blob/49dcc801a3a2cd2c2f6c219c85416e404cf98c48/test/runtests.jl#L2-L9 with
@show Base.load_path()
test_project = first(Base.load_path())
@show test_project
preferences_file = joinpath(@__DIR__, "LocalPreferences.toml")
@show preferences_file
test_preferences_file = joinpath(dirname(test_project), "LocalPreferences.toml")
@show test_preferences_file
if isfile(preferences_file) && !isfile(test_preferences_file)
    @info "Copying!"
    cp(preferences_file, test_preferences_file)
end
@show readdir(dirname(test_project))
(Yes, I removed the VERSION <= v"1.8-" check for the reason explained above.) With this, MPI.jl now correctly picks up the system MPI, but this is a bit too convoluted :slightly_smiling_face:
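For reference, the copy step from the snippet above can be wrapped in a small helper. This is only a minimal sketch, not what the branch ships: `propagate_preferences` is a hypothetical name, and the usage comment assumes that the first entry of `Base.load_path()` is the temporary test project's `Project.toml`, as in the snippet.

```julia
# Hypothetical helper (not part of MPI.jl): copy a LocalPreferences.toml
# from `src_dir` next to the project file `test_project`, unless one is
# already present there. Returns true if a copy was made, false otherwise.
function propagate_preferences(src_dir::AbstractString, test_project::AbstractString)
    src = joinpath(src_dir, "LocalPreferences.toml")
    dst = joinpath(dirname(test_project), "LocalPreferences.toml")
    if isfile(src) && !isfile(dst)
        cp(src, dst)
        return true
    end
    return false
end

# Intended use at the top of test/runtests.jl (assumption: the active
# test project is the first entry of the load path):
# propagate_preferences(@__DIR__, first(Base.load_path()))
```

Returning a `Bool` makes it easy to log whether anything was actually copied, which was the open question in this thread.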
@giordano I managed to reproduce what is happening on 1.8. You still need to copy LocalPreferences.toml to test/.
@staticfloat do you remember why we decided against propagating the settings from the environment in which the user calls ]test? I remember us discussing it, but I don't recall the conclusion.
I can also confirm that the tests now pick up the system MPI implementation after manually copying LocalPreferences.toml to test/ (otherwise they fail) in the setting described in https://github.com/JuliaParallel/MPI.jl/pull/564#issuecomment-1100751384.
The MPI tests still don't pick up the correct MPI implementation (and fail) if run from outside the dev'ed or cloned MPI.jl repo.
luraess@superzack:~/scratch/dev/MPI-rocmaware-dev$ juliap
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |
(@v1.7) pkg> activate .
Activating new project at `/scratch-1/luraess/dev/MPI-rocmaware-dev`
(MPI-rocmaware-dev) pkg> add /scratch-1/luraess/dev/MPI.jl
Updating git-repo `/scratch-1/luraess/dev/MPI.jl`
Updating registry at `~/.julia/registries/General.toml`
Resolving package versions...
(MPI-rocmaware-dev) pkg> add MPIPreferences
Resolving package versions...
Updating `/scratch-1/luraess/dev/MPI-rocmaware-dev/Project.toml`
[3da0fdf6] + MPIPreferences v0.1.1
No Changes to `/scratch-1/luraess/dev/MPI-rocmaware-dev/Manifest.toml`
(MPI-rocmaware-dev) pkg> st
Status `/scratch-1/luraess/dev/MPI-rocmaware-dev/Project.toml`
[da04e1cc] MPI v0.20.0-dev `/scratch-1/luraess/dev/MPI.jl#lr/rocmaware-dev`
[3da0fdf6] MPIPreferences v0.1.1
julia> using MPI
julia> MPI.MPI_LIBRARY_VERSION_STRING
"MPICH Version:\t4.0.2\nMPICH Release date:\tThu Apr 7 12:34:45 CDT 2022\nMPICH ABI:\t14:2:2\nMPICH Device:\tch3:nemesis\nMPICH configure:\t--prefix=/workspace/destdir --build=x86_64-linux-musl --host=x86_64-linux-gnu --enable-shared=yes --enable-static=no --with-device=ch3 --disable-dependency-tracking --enable-fast=all,O3 --docdir=/tmp --disable-opencl\nMPICH CC:\tcc -DNDEBUG -DNVALGRIND -O3\nMPICH CXX:\tc++ -DNDEBUG -DNVALGRIND -O3\nMPICH F77:\tgfortran -O3\nMPICH FC:\tgfortran -O3\n"
julia> MPI.use_system_binary()
┌ Info: MPI implementation
│ libmpi = "libmpi"
│ version_string = "Open MPI v4.1.2, package: Open MPI luraess@superzack Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021\0"
│ impl = "OpenMPI"
│ version = v"4.1.2"
└ abi = "OpenMPI"
┌ Warning: The underlying MPI implementation has changed. You will need to restart Julia for this change to take effect
│ libmpi = "libmpi"
│ abi = "OpenMPI"
│ mpiexec = "mpiexec"
└ @ MPIPreferences ~/.julia/packages/MPIPreferences/uArzO/src/MPIPreferences.jl:119
shell> ls
LocalPreferences.toml Manifest.toml Project.toml
shell> cat Project.toml
[deps]
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
MPIPreferences = "3da0fdf6-3ccc-4f1b-acd9-58baa6c99267"
julia> exit()
luraess@superzack:~/scratch/dev/MPI-rocmaware-dev$ juliap
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |
julia> using MPI
julia> MPI.MPI_LIBRARY_VERSION_STRING
"Open MPI v4.1.2, package: Open MPI luraess@superzack Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021\0"
(MPI-rocmaware-dev) pkg> test MPI
Testing MPI
[...]
Testing Running tests...
┌ Info: Running MPI tests
│ ArrayType = Array
│ nprocs = 2
│ MPIPreferences.abi = "MPICH"
└ MPIPreferences.binary = "MPICH_jll"
But it doesn't look like you copied the LocalPreferences.toml into test/? I agree that this is bad UX, but we might have to live with it, since fixing it requires a change in Pkg.jl.
Sorry for not specifying that I am using the local MPI.jl project, where LocalPreferences.toml was copied to test/:
shell> pwd
/home/luraess/scratch/dev/MPI-rocmaware-dev
shell> ls /home/luraess/scratch/dev/MPI.jl/test/
LocalPreferences.toml
[...]
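Beyond checking that the file exists, it can help to look at what it actually contains. The following is a minimal sketch using Julia's `TOML` standard library; `mpi_preferences` is a hypothetical helper name, and the assumption that the settings live in a `[MPIPreferences]` table is based on how the test log reports `MPIPreferences.abi` and `MPIPreferences.binary`.

```julia
using TOML

# Hypothetical helper: return the [MPIPreferences] table from a
# LocalPreferences.toml, or an empty Dict if the file or table is missing.
function mpi_preferences(path::AbstractString)
    isfile(path) || return Dict{String,Any}()
    return get(TOML.parsefile(path), "MPIPreferences", Dict{String,Any}())
end

# Example: inspect the file that was copied into test/
# mpi_preferences(joinpath("test", "LocalPreferences.toml"))
```

An empty result for the copy in test/ would suggest the file was copied from the wrong place or written before the preferences were set.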
Can you add some debug print statements to the block that copies LocalPreferences.toml into the temporary test project, to see what it is locating and trying to do? In particular, add something at the cp line: if the file isn't being copied, there is no chance it picks up the right thing.
Uh, I did manually copy LocalPreferences.toml into test/, and I am on my (up-to-date) fork of MPI.jl#master, having added https://github.com/JuliaParallel/MPI.jl/blob/41121ad3eb4c2f986246d6872154586f142c7022/test/runtests.jl#L1-L9 from the vc/propagate_preferences_to_test branch to my runtests.jl script.
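One way to get that kind of debug output is to report, for every entry of the load path, whether a LocalPreferences.toml sits next to it. This is a rough sketch only: `report_preferences` is a hypothetical helper, not something in MPI.jl or Pkg.jl.

```julia
# Hypothetical debugging helper: for each project file in `paths`, log
# whether a LocalPreferences.toml exists in the same directory, and
# return the list of preference files that were found.
function report_preferences(paths = Base.load_path())
    found = String[]
    for project in paths
        candidate = joinpath(dirname(project), "LocalPreferences.toml")
        exists = isfile(candidate)
        @info "Checking for preferences" project candidate exists
        exists && push!(found, candidate)
    end
    return found
end
```

Calling `report_preferences()` at the top of runtests.jl, before and after the copy, should show whether the temporary test project ends up with a preferences file at all.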
@vchuravy is this ready to merge?
Closing this, as we have a workaround for 1.6 & 1.7, and it works correctly on 1.8