Distributed docs are a bit contradictory about whether packages you are `using` on the manager process get loaded on the workers

Open oxinabox opened this issue 6 years ago • 2 comments

The answer is: yes. When you `using` a package on the manager, it is loaded on all existing workers (i.e. `require`d, but not brought into scope). A worker started afterwards, however, does not start with the packages that are currently loaded on the manager process.

https://docs.julialang.org/en/v1/manual/parallel-computing/#code-availability-1 first says:

Finally, if DummyModule.jl is not a standalone file but a package, then using DummyModule will load DummyModule.jl on all processes, but only bring it into scope on the process where using was called.

Then later kind of contradicts that:

Note that workers do not run a ~/.julia/config/startup.jl startup script, nor do they synchronize their global state (such as global variables, new method definitions, and loaded modules) with any of the other running processes.

The second bit is kind of wrong: it does synchronize loaded modules.

oxinabox, Oct 30 '19 11:10

➜  ~ julia -p 2
               _
   _       _ _(_)_     |  Documentation: https://docs.julialang.org
  (_)     | (_) (_)    |
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 1.2.0 (2019-08-20)
 _/ |\__'_|_|_|\__'_|  |
|__/                   |

julia> using SIMD

julia> remotecall_fetch(()->Main.SIMD.Vec{2, Float64}((0, 0)), 2)
<2 x Float64>[0.0, 0.0]

julia> addprocs(1)
1-element Array{Int64,1}:
 4

julia> remotecall_fetch(()->Main.SIMD.Vec{2, Float64}((0, 0)), 4)
ERROR: On worker 4:
KeyError: key SIMD [fdea26ae-647d-5447-a871-4b548cad5224] not found
deserialize_global_from_main at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Serialization/src/Serialization.jl:722
#5 at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/clusterserialize.jl:72 [inlined]
foreach at ./abstractarray.jl:1920
deserialize at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/clusterserialize.jl:72
#105 at ./task.jl:268
Stacktrace:
 [1] #remotecall_fetch#149 at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/remotecall.jl:379 [inlined]
 [2] remotecall_fetch(::Function, ::Distributed.Worker) at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/remotecall.jl:371
 [3] #remotecall_fetch#152(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(remotecall_fetch), ::Function, ::Int64) at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/remotecall.jl:406
 [4] remotecall_fetch(::Function, ::Int64) at /build/julia/src/julia-1.2.0/usr/share/julia/stdlib/v1.2/Distributed/src/remotecall.jl:406
 [5] top-level scope at REPL[4]:1

No, processes added afterwards do not synchronize loaded modules.

vchuravy, Oct 31 '19 13:10

No, processes added afterwards do not synchronize loaded modules.

Indeed, but ones added before do.

Maybe we should include that example in the docs (probably with an `addprocs` at the start rather than `-p`, for brevity). That would clear things up.
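A minimal sketch of what such a docs example might look like, starting the workers with `addprocs` instead of `-p`. It assumes the SIMD package from the session above is installed. `can_use_simd` is a hypothetical helper introduced only for this illustration, and the results noted in the comments are those observed on Julia 1.2 in the session above, so they may differ on other versions.

```julia
using Distributed

# Workers that exist BEFORE the package is loaded on the manager.
early = addprocs(2)

# Assumption: SIMD (the package used in the session above) is installed.
using SIMD

# Hypothetical helper: true if worker `w` can construct a SIMD.Vec,
# false if the remote call throws (e.g. because SIMD is not loaded there).
function can_use_simd(w)
    try
        remotecall_fetch(() -> (Main.SIMD.Vec{2, Float64}((0, 0)); true), w)
    catch
        false
    end
end

# In the Julia 1.2 session above, the call succeeded on an existing worker.
@show all(can_use_simd, early)

# A worker added AFTER `using SIMD` does not inherit the loaded modules;
# in the session above the same call failed with a KeyError on such a worker.
late = first(addprocs(1))
@show can_use_simd(late)

# Loading the package explicitly on every process makes it available
# on the late worker as well.
@everywhere using SIMD
@show can_use_simd(late)
```

The final `@everywhere using SIMD` is the usual way to make sure every process, including ones added later, has the package loaded.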

oxinabox, Oct 31 '19 15:10