MPI.jl

Workaround Preferences in test environment for 1.6 & 1.7

Open vchuravy opened this issue 3 years ago • 13 comments

Fixes #561

vchuravy avatar Apr 16 '22 13:04 vchuravy

@luraess can you confirm that this fixes the issue for you?

vchuravy avatar Apr 16 '22 15:04 vchuravy

Running the tests with export JULIA_MPI_TEST_NPROCS=2 on branch origin/vc/propagate_preferences_to_test still results in the following (erroneous?) behaviour (see #561):

Hello world, I am rank 1 of 2
Hello world, I am rank 0 of 2
Test Summary: | Pass  Total
mpiexecjl     |    6      6
┌ Info: Running MPI tests
│   ArrayType = Array
│   nprocs = 2
│   MPIPreferences.abi = "MPICH"
└   MPIPreferences.binary = "MPICH_jll"

One thing is solved, however (I don't know thanks to which change): the tests no longer error on test_error.jl as they did previously (not reported).

luraess avatar Apr 16 '22 20:04 luraess

For me too, running the tests with Julia v1.7.2 still picks up MPICH_jll instead of the system MPI.

giordano avatar Apr 18 '22 09:04 giordano

What I'm doing right now to run the tests with the system MPI is to activate the test environment and run MPIPreferences.use_system_binary(). Then I replaced https://github.com/JuliaParallel/MPI.jl/blob/49dcc801a3a2cd2c2f6c219c85416e404cf98c48/test/runtests.jl#L2-L9 with:

    @show Base.load_path()
    test_project = first(Base.load_path())
    @show test_project
    preferences_file = joinpath(@__DIR__, "LocalPreferences.toml")
    @show preferences_file
    test_preferences_file = joinpath(dirname(test_project), "LocalPreferences.toml")
    @show test_preferences_file
    if isfile(preferences_file) && !isfile(test_preferences_file)
        @info "Copying!"
        cp(preferences_file, test_preferences_file)
    end
    @show readdir(dirname(test_project))

(Yes, I removed the VERSION <= v"1.8-" check for the reason explained above.) With this, MPI.jl now correctly picks up the system MPI. But this is a bit too convoluted 🙂
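
The debugging snippet above could be condensed into a small helper. This is only a sketch: `propagate_preferences` is a hypothetical name, not something MPI.jl provides, and it assumes (as the snippet does) that the first entry of `Base.load_path()` is the project file of the temporary test environment.

```julia
# Hypothetical helper (not part of MPI.jl): copy a LocalPreferences.toml
# from `src_dir` into the directory that holds the active test project file.
# Assumption: `test_project` is a path to a project file, as in the snippet above.
function propagate_preferences(src_dir::AbstractString,
                               test_project::AbstractString = first(Base.load_path()))
    src = joinpath(src_dir, "LocalPreferences.toml")
    dst = joinpath(dirname(test_project), "LocalPreferences.toml")
    # Only copy when a source file exists and nothing would be overwritten.
    if isfile(src) && !isfile(dst)
        @info "Copying preferences into test environment" src dst
        cp(src, dst)
        return true
    end
    return false
end
```

Calling `propagate_preferences(@__DIR__)` at the top of test/runtests.jl would then mirror the behaviour of the snippet, with the return value indicating whether a copy actually happened.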

giordano avatar Apr 18 '22 10:04 giordano

@giordano I managed to reproduce what is happening for 1.8. You still need to copy LocalPreferences.toml to test/

@staticfloat do you remember why we decided against propagating the settings from the environment in which the user calls ]test? I remember us discussing it, but I don't recall the conclusion.

vchuravy avatar Apr 18 '22 13:04 vchuravy

Can also confirm that tests now pick up the system MPI implementation upon manually copying the LocalPreferences.toml to test/ (otherwise it fails) in the setting described in https://github.com/JuliaParallel/MPI.jl/pull/564#issuecomment-1100751384.

luraess avatar Apr 18 '22 20:04 luraess

The MPI tests still don't pick up the correct MPI implementation (and fail) if run from outside the dev'ed or cloned MPI.jl repo.

luraess@superzack:~/scratch/dev/MPI-rocmaware-dev$ juliap
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

(@v1.7) pkg> activate .
  Activating new project at `/scratch-1/luraess/dev/MPI-rocmaware-dev`

(MPI-rocmaware-dev) pkg> add /scratch-1/luraess/dev/MPI.jl
    Updating git-repo `/scratch-1/luraess/dev/MPI.jl`
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...

(MPI-rocmaware-dev) pkg> add MPIPreferences
   Resolving package versions...
    Updating `/scratch-1/luraess/dev/MPI-rocmaware-dev/Project.toml`
  [3da0fdf6] + MPIPreferences v0.1.1
  No Changes to `/scratch-1/luraess/dev/MPI-rocmaware-dev/Manifest.toml`

(MPI-rocmaware-dev) pkg> st
      Status `/scratch-1/luraess/dev/MPI-rocmaware-dev/Project.toml`
  [da04e1cc] MPI v0.20.0-dev `/scratch-1/luraess/dev/MPI.jl#lr/rocmaware-dev`
  [3da0fdf6] MPIPreferences v0.1.1

julia> using MPI

julia> MPI.MPI_LIBRARY_VERSION_STRING
"MPICH Version:\t4.0.2\nMPICH Release date:\tThu Apr  7 12:34:45 CDT 2022\nMPICH ABI:\t14:2:2\nMPICH Device:\tch3:nemesis\nMPICH configure:\t--prefix=/workspace/destdir --build=x86_64-linux-musl --host=x86_64-linux-gnu --enable-shared=yes --enable-static=no --with-device=ch3 --disable-dependency-tracking --enable-fast=all,O3 --docdir=/tmp --disable-opencl\nMPICH CC:\tcc    -DNDEBUG -DNVALGRIND -O3\nMPICH CXX:\tc++   -DNDEBUG -DNVALGRIND -O3\nMPICH F77:\tgfortran   -O3\nMPICH FC:\tgfortran   -O3\n"

julia> MPI.use_system_binary()
┌ Info: MPI implementation
│   libmpi = "libmpi"
│   version_string = "Open MPI v4.1.2, package: Open MPI luraess@superzack Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021\0"
│   impl = "OpenMPI"
│   version = v"4.1.2"
└   abi = "OpenMPI"
┌ Warning: The underlying MPI implementation has changed. You will need to restart Julia for this change to take effect
│   libmpi = "libmpi"
│   abi = "OpenMPI"
│   mpiexec = "mpiexec"
└ @ MPIPreferences ~/.julia/packages/MPIPreferences/uArzO/src/MPIPreferences.jl:119

shell> ls
LocalPreferences.toml  Manifest.toml  Project.toml

shell> cat Project.toml
[deps]
MPI = "da04e1cc-30fd-572f-bb4f-1f8673147195"
MPIPreferences = "3da0fdf6-3ccc-4f1b-acd9-58baa6c99267"

julia> exit()
luraess@superzack:~/scratch/dev/MPI-rocmaware-dev$ juliap
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.7.1 (2021-12-22)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> using MPI

julia> MPI.MPI_LIBRARY_VERSION_STRING
"Open MPI v4.1.2, package: Open MPI luraess@superzack Distribution, ident: 4.1.2, repo rev: v4.1.2, Nov 24, 2021\0"

(MPI-rocmaware-dev) pkg> test MPI
     Testing MPI
     [...]
     Testing Running tests...
┌ Info: Running MPI tests
│   ArrayType = Array
│   nprocs = 2
│   MPIPreferences.abi = "MPICH"
└   MPIPreferences.binary = "MPICH_jll"

luraess avatar Apr 19 '22 07:04 luraess

But this doesn't look like you copied the LocalPreferences.toml into test/? I agree that this is bad UX, but we might have to live with it, since the fix needs to happen in Pkg.jl.
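
One way to check what a given directory's preferences file would select is to read it directly. The sketch below uses only the TOML standard library; `selected_binary` is a hypothetical helper, and the `[MPIPreferences]` table with a `binary` key is an assumption based on the `MPIPreferences.binary` value printed in the test log.

```julia
using TOML  # standard library since Julia 1.6

# Hypothetical helper: report which MPI binary a LocalPreferences.toml selects.
# Assumption: preferences are stored under an [MPIPreferences] table with a
# "binary" key, matching the `MPIPreferences.binary` entry in the test log.
function selected_binary(dir::AbstractString)
    path = joinpath(dir, "LocalPreferences.toml")
    isfile(path) || return nothing  # no preferences file in this directory
    prefs = TOML.parsefile(path)
    table = get(prefs, "MPIPreferences", Dict{String,Any}())
    return get(table, "binary", nothing)
end
```

Running `selected_binary("test")` from the MPI.jl checkout would return `nothing` if the file was never copied there.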

vchuravy avatar Apr 19 '22 07:04 vchuravy

Sorry for not having specified that I am using the local MPI.jl project, where the LocalPreferences.toml was copied to test/:

shell> pwd
/home/luraess/scratch/dev/MPI-rocmaware-dev

shell> ls /home/luraess/scratch/dev/MPI.jl/test/
LocalPreferences.toml
[...]

luraess avatar Apr 19 '22 07:04 luraess

Can you add some debug print statements to the block that copies the LocalPreferences.toml into the temporary test project, to see what it is locating and trying to do? In particular, add something at the cp line: if it isn't copying, there is no chance it picks up the right thing.
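
A rough sketch of such a diagnostic, using only Base: for each load-path entry it reports whether a LocalPreferences.toml sits next to it. `preference_files` is a hypothetical name, and the sketch assumes preferences are looked up beside the project files on the load path.

```julia
# Hypothetical diagnostic (not part of MPI.jl): for each load-path entry,
# report whether a LocalPreferences.toml exists alongside it.
# Entries may be project files or plain directories, so handle both.
function preference_files(entries = Base.load_path())
    map(entries) do entry
        dir = isdir(entry) ? entry : dirname(entry)
        file = joinpath(dir, "LocalPreferences.toml")
        (entry = entry, file = file, exists = isfile(file))
    end
end
```

Printing the result with `foreach(println, preference_files())` just before the cp call would show which locations the test process can actually see.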

vchuravy avatar Apr 19 '22 08:04 vchuravy

Uh, I did manually copy the LocalPreferences.toml into test/, and I am on my (up-to-date) fork of MPI.jl#master, having added https://github.com/JuliaParallel/MPI.jl/blob/41121ad3eb4c2f986246d6872154586f142c7022/test/runtests.jl#L1-L9 from the vc/propagate_preferences_to_test branch to my runtests.jl script.

luraess avatar Apr 19 '22 08:04 luraess

@vchuravy is this ready to merge?

simonbyrne avatar Apr 28 '22 04:04 simonbyrne

Closing this, as we have a workaround for 1.6 & 1.7, and it works correctly on 1.8

simonbyrne avatar Oct 03 '22 20:10 simonbyrne