Memory leak(?) in `Arrow.write()`

Open · Moelf opened this issue 4 years ago • 1 comment

@aminnj noticed this first. To reproduce, download the Run2012... file from http://opendata.web.cern.ch/record/12341/files/Run2012BC_DoubleMuParked_Muons.root and use the following setup:

# call this test_partition.jl
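using UnROOT, Tables, Arrow

const tf = LazyTree(ROOTFile("Run2012BC_DoubleMuParked_Muons.root"), "Events")

# Read one cluster range while holding the lock on the underlying file,
# forcing a full GC before each read.
function _lockedget(t::LazyTree, r::UnitRange)
    f = getproperty(t, first(propertynames(t))).f
    lock(f)
    @show r
    GC.gc()
    try
        return t[r]
    catch
        # errors are swallowed here, so the function returns `nothing` on failure
    finally
        unlock(f)
    end
end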

Tables.partitions(t::LazyTree) = (_lockedget(t, r) for r in UnROOT._clusterranges(t))

Then set the (soft) virtual memory limit to about 3 GB:

$ ulimit -Sv 3000000

Then run one of the following:

include("test_partition.jl")

# this finishes
for t in Tables.partitions(tf)
    @show length(t)
end

include("test_partition.jl")

# this crashes
tf |> Arrow.write("doublemu.arrow", ntasks=1)

So UnROOT.jl on its own isn't leaking memory; maybe we're doing something that doesn't play nicely with Arrow?
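
One way to check the "iteration alone is fine" half of this is to log the live GC heap after each partition is materialized. The following is only a rough sketch: `live_heap_per_partition` is a hypothetical helper (not part of UnROOT, Tables, or Arrow), and `demo` is stand-in data so the snippet runs without the ROOT file.

```julia
using Tables

# Hypothetical helper: materialize each partition in turn and record the
# live GC heap size (in MiB) after a full collection.
function live_heap_per_partition(source)
    heap_mib = Float64[]
    for (i, part) in enumerate(Tables.partitions(source))
        cols = Tables.columns(part)      # force this partition into memory
        nrows = Tables.rowcount(cols)
        GC.gc()                          # the current partition is still referenced here
        push!(heap_mib, Base.gc_live_bytes() / 2^20)
        println("partition $i: $nrows rows, live heap ≈ $(round(heap_mib[end]; digits=1)) MiB")
    end
    return heap_mib
end

# Stand-in data (an assumption, not the real tree): five lazily generated
# partitions, each a NamedTuple column table.
demo = Tables.partitioner(((x = rand(100_000),) for _ in 1:5))
heap = live_heap_per_partition(demo)
```

With the `Tables.partitions(t::LazyTree)` override from the setup above, `live_heap_per_partition(tf)` should do the same for the real file. If the numbers stay roughly flat there, that would suggest the partitions are being retained somewhere downstream of the iteration, though it would not by itself pin the problem on Arrow.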

Moelf, Oct 02 '21 12:10

Now I'm wondering if this is related to #237.

Moelf, Nov 29 '23 12:11