MPI.jl
Workers don't spool up on first run of passive RMA example
I saw some odd behavior: the first time I ran the passive RMA example, I got this output:
> mpiexec -n 4 julia --project passive_rma.jl
After Put with lock / unlock, window content on rank 0:
all_ranks = [0, -1, -1, -1]
All subsequent runs gave the desired:
> mpiexec -n 4 julia --project passive_rma.jl
After Put with lock / unlock, window content on rank 0:
all_ranks = [0, 1, 2, 3]
The subsequent runs were started immediately afterwards, with no changes to the code or environment. I tried to reproduce the behavior by deleting recent files in .julia/compiled/v1.9/MPI, but the incorrect output did not come back. I wonder if there should be another MPI.Win_fence(0, win) after the creation of win?
Julia Version 1.9.3
Commit bed2cd540a (2023-08-24 14:43 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 24 × AMD Ryzen 9 5900 12-Core Processor
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-14.0.6 (ORCJIT, znver3)
Threads: 8 on 24 virtual cores
Environment:
JULIA_EDITOR = code
JULIA_NUM_THREADS = 8
.julia\environments\v1.9\Project.toml`
[6e4b80f9] BenchmarkTools v1.3.2
[336ed68f] CSV v0.10.11
[052768ef] CUDA v5.1.0
[35d6a980] ColorSchemes v3.24.0
[a93c6f00] DataFrames v1.6.1
[5789e2e9] FileIO v1.16.1
[e9467ef8] GLMakie v0.8.12
[f67ccb44] HDF5 v0.17.1
[916415d5] Images v0.26.0
[a98d9a8b] Interpolations v0.14.7
[42fd0dbc] IterativeSolvers v0.9.3
[033835bb] JLD2 v0.4.38
[ba0b0d4f] Krylov v0.9.4
[7ed4a6bd] LinearSolve v2.19.0
[da04e1cc] MPI v0.20.18
[3da0fdf6] MPIPreferences v0.1.10
[299715c1] MarchingCubes v0.1.8
[7269a6da] MeshIO v0.4.10
[eacbb407] Meshes v0.35.17
[2679e427] Metis v1.4.0
[91a5bcdd] Plots v1.39.0
[42171d58] PlyIO v1.1.2
[dc215faf] ReadVTK v0.2.0
[90137ffa] StaticArrays v1.6.5
[286e6d88] SymRCM v0.2.1
[64499a7a] WriteVTK v1.18.1
[de0858da] Printf
[2f01184e] SparseArrays
I don't know enough about RMA to know if this is correct, but I will note that it is also using the older syntax.
I'm not 100% sure that example is correct. An improved one would be much appreciated!
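For discussion, here is a minimal sketch of what a more defensive version might look like. It is not the original passive_rma.jl (its source is not shown in this thread), and it is only a guess that the `[0, -1, -1, -1]` output comes from rank 0 reading its buffer before the other ranks have finished their `Put`. The sketch assumes the keyword-style one-sided calls of MPI.jl v0.20 (`MPI.Win_lock(win; rank, type)`, `MPI.Put!(origin, win; rank, disp)`, `MPI.Win_unlock(win; rank)`); the variable names are illustrative.

```julia
using MPI

MPI.Init()
comm = MPI.COMM_WORLD
rank = MPI.Comm_rank(comm)
nranks = MPI.Comm_size(comm)

# Every rank exposes a buffer, but only rank 0's buffer is used as a target.
all_ranks = fill(-1, nranks)
win = MPI.Win_create(all_ranks, comm)

# Passive-target epoch: each rank writes its own rank number into
# element `rank` of the window on rank 0 (disp counts elements here,
# assuming Win_create sets the displacement unit to sizeof(eltype)).
MPI.Win_lock(win; rank=0, type=:exclusive)
MPI.Put!([rank], win; rank=0, disp=rank)
MPI.Win_unlock(win; rank=0)

# Win_unlock only completes this rank's own Put. Rank 0 still has to
# wait until every other rank has unlocked before it reads the buffer.
MPI.Barrier(comm)

if rank == 0
    # Lock the local window while reading, to stay within the RMA rules.
    MPI.Win_lock(win; rank=0, type=:shared)
    println("After Put with lock / unlock, window content on rank 0:")
    @show all_ranks
    MPI.Win_unlock(win; rank=0)
end

MPI.free(win)
```

The barrier is the part that matters for the symptom above: without some collective synchronization between the last `Win_unlock` and the read on rank 0, a rank that starts slowly (for example because of first-run compilation latency) could plausibly still be inside its epoch when rank 0 prints. Whether that is what actually happened here is an assumption.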