Memory leak when copying data from one worker to another

Open RomeoV opened this issue 1 year ago • 2 comments

Repeatedly allocating matrices on one worker and copying them to another leaks memory on my machine until the Julia session is killed.

julia> using Distributed
julia> addprocs(8)
julia> using Dagger
julia> for _ in 1:5
       @time fetch(let
         foo = Dagger.@spawn scope=Dagger.scope(worker=1) rand(10000, 10000);
         Dagger.@spawn scope=Dagger.scope(worker=2) copy(foo)
       end);
       end

The foo matrix and its copy should be garbage collected after each iteration, but I don't think they are. Even if they are not, each matrix is 0.8 GB, so five iterations with one matrix on each of the two workers add up to 5 * 2 * 0.8 GB = 8 GB, which should not exhaust my RAM (I have at least 16 GB free).
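
For reference, the estimate above can be reproduced in a few lines. This is only a rough sketch: it assumes Float64 elements (the default for rand), uses decimal gigabytes, and ignores array headers and any temporary buffers that may be allocated while the data moves between workers. The variable names are illustrative.

```julia
# Back-of-the-envelope check of the memory estimate; not a measurement.
n = 10_000
bytes_per_matrix = n * n * sizeof(Float64)   # rand(n, n) returns a Matrix{Float64}
gb_per_matrix = bytes_per_matrix / 1e9       # decimal gigabytes

iterations = 5
copies_per_iteration = 2                     # foo on one worker plus its copy on the other

# Worst case: nothing is ever garbage collected during the loop.
worst_case_gb = iterations * copies_per_iteration * gb_per_matrix

println("per matrix: ", gb_per_matrix, " GB, worst case: ", worst_case_gb, " GB")
```

If memory use grows well past this worst-case figure, something beyond the ten matrices themselves is being retained.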

The session is a clean temp project with Dagger v0.18.12 and Julia 1.10.4.

RomeoV avatar Jul 11 '24 23:07 RomeoV

Sorry for the slow reply - this is probably a known memory leak, also reported offline by @mofeing in a similar case. I'll investigate and see if I can resolve it.

jpsamaroo avatar Jul 23 '24 18:07 jpsamaroo

Using the example above, I've found the initial source of retained memory, and am fixing it in https://github.com/JuliaParallel/Dagger.jl/pull/558 (branch is very WIP, expect it to not work right now). I'll close this issue once that PR is merged, which should fully address this.

jpsamaroo avatar Jul 24 '24 21:07 jpsamaroo