HDF5.jl
Automatic h5_garbage_collect() garbage collection
Good afternoon,
There might be a memory leak in HDF5, related to using driver=Drivers.Core(; backing_store=false).
I created a reduced example that can be reproduced as follows:
- generate a docker file including HDF5
# build with -> docker build -t hdf5test:1.0 .
FROM julia:1.11.2
RUN julia -e "import Pkg; Pkg.add([\"HDF5\", \"H5Zblosc\"])"
ENTRYPOINT ["julia"]
- run the following code in the docker container (e.g., run
sudo docker run -it --memory=500m hdf5test:1.0 and paste the code); sooner or later the container will be killed for running out of memory (OOM)
using HDF5
function main()
    while true
        h5open("abc.h5", "w"; driver=Drivers.Core(; backing_store=false)) do fid
            fid["M"] = randn(1000, 1000)
            return Vector{UInt8}(fid)
        end
        # GC.gc() # enabling or disabling this doesn't change much
    end
    return nothing
end
main()
The container's memory usage immediately jumps close to the limit and stays there for a while; with a higher memory cap, it takes longer for the container to be killed. Once the container has been killed, you can run docker inspect <containerid> to confirm that it was due to memory.
Best regards, Christian Dengler
Could you see if invoking HDF5.API.h5_garbage_collect() helps?
https://github.com/JuliaIO/HDF5.jl/blob/master/src%2Fapi%2Ffunctions.jl#L67
I did a quick test: including this in the loop seems to stabilize the memory usage. I guess this is not a bug then? Or should this be called automatically somehow?
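For reference, a minimal sketch of the workaround described here: the loop from the reproducer with HDF5.API.h5_garbage_collect() called after each file is closed. The function name write_images and the bounded iteration count are illustrative assumptions, added so the snippet terminates.

```julia
using HDF5

# Bounded variant of the reproducer with the suggested workaround applied.
# `write_images` is a hypothetical name chosen for this sketch.
function write_images(n::Integer)
    total_bytes = 0
    for _ in 1:n
        image = h5open("abc.h5", "w"; driver=HDF5.Drivers.Core(; backing_store=false)) do fid
            fid["M"] = randn(1000, 1000)
            return Vector{UInt8}(fid)
        end
        total_bytes += length(image)
        # Ask libhdf5 to release the memory held on its internal free lists.
        HDF5.API.h5_garbage_collect()
    end
    return total_bytes
end

write_images(100)
```

Whether calling it on every iteration is necessary, or only occasionally, was not measured in this thread.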
I would consider this to be a workaround for now.
I need to investigate further how well this is documented upstream in HDF5 itself, and when would be appropriate to call this automatically.
Perhaps an HDF5.gc() would be warranted if this needs to be called by the user.
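As a rough sketch of what such a high-level function might look like: the name hdf5_gc and the julia_gc keyword are hypothetical and not part of HDF5.jl. It assumes that running the Julia GC first gives finalizers a chance to close unreachable HDF5 handles before libhdf5 is asked to release its free lists.

```julia
using HDF5

# Hypothetical user-facing wrapper; HDF5.jl does not currently provide this function.
function hdf5_gc(; julia_gc::Bool=true)
    # Optionally run the Julia GC first so unreachable HDF5 objects can be finalized.
    julia_gc && GC.gc()
    # Then ask libhdf5 to free the memory on its internal free lists.
    HDF5.API.h5_garbage_collect()
    return nothing
end

hdf5_gc()
```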
Ok, I'll keep this ticket open in that case.
Ideally we should call this when the Julia GC is invoked, but we probably don't want to call it every time an object is freed.
One way to do this would be to add a callback into the Julia GC (so it gets called after the Julia GC is invoked). This can be done by calling jl_gc_set_cb_post_gc with a function pointer. The downside is that we can't call actual Julia code, so we would have to write a C shim around it. This is what I did for NVTX.jl:
https://github.com/JuliaGPU/NVTX.jl/blob/main/src/julia.jl
In this case, with the do syntax, I think we could call the HDF5 GC when closing the "file", whenever we know that the file is backed by allocated memory.
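A minimal sketch of that idea from the user's side, assuming a hypothetical helper named with_core_file (not part of HDF5.jl) that wraps h5open with the Core driver and calls the HDF5 GC once the do block has finished and the file has been closed:

```julia
using HDF5

# Hypothetical helper: open an in-memory file, run `f` on it, and release
# libhdf5's free lists after h5open has closed the file.
function with_core_file(f, name::AbstractString)
    try
        return h5open(f, name, "w"; driver=HDF5.Drivers.Core(; backing_store=false))
    finally
        HDF5.API.h5_garbage_collect()
    end
end

image = with_core_file("abc.h5") do fid
    fid["M"] = randn(10, 10)
    Vector{UInt8}(fid)
end
```

Doing this inside HDF5.jl itself would require the close path to know which driver the file was opened with, which this sketch sidesteps by hard-coding the Core driver.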
Can I help somehow with this issue? I'd like MAT.jl to be in a good state.
Could you see if invoking HDF5.API.h5_garbage_collect() helps?
I'm not sure if the issues are linked. The next step would be to create the high level function.