ZMQ.jl memory leak

Open StefanKarpinski opened this issue 9 years ago • 26 comments

Example code:

pid = getpid()
vsz(s) = println(s*split(open(readall,`ps -p $pid -o vsz`),"\n")[2])
vsz("Initial VSZ=")

using ZMQ
vsz("After loading ZMQ, my VSZ=")

ctx = Context()
socket = Socket(ctx, PUB)
ZMQ.bind(socket, "ipc:///tmp/testZMQ")

vsz("After setting up ZMQ, my VSZ=")
println("Sending")
for i = 1:10000000
    ZMQ.send(socket, "abcdefghijklmnopqrstuvwxyz")
    if i % 100000 == 0
        println("Sent $i messages")
        println("Length of gc_protect: $(length(ZMQ.gc_protect))")
        vsz("My current VSZ=")
    end
end
vsz("Final VSZ=")

The virtual size keeps growing endlessly.

Mar 22 '15 11:03 StefanKarpinski

Cc: @tanmaykm @amitmurthy

Mar 23 '15 09:03 ViralBShah

Seems to be fixed. Thank you, @Keno!

Mar 23 '15 18:03 StefanKarpinski

Was this a problem with both 0.3 and 0.4?

Mar 24 '15 08:03 tkelman

Yes.

Mar 24 '15 09:03 ViralBShah

And after 791b5d4af2c2fb029e4a38b291726964a0515dcf in the package, 0.3 still leaks memory?

Mar 24 '15 09:03 tkelman

@StefanKarpinski knows the details best about what had to be done on 0.3. Let's wait for him to chime in.

Mar 24 '15 09:03 ViralBShah

I'm going to sleep. @staticfloat may be online for a little while, and can do anything necessary with binaries. If we decide to immediately backport the corresponding Julia commit and re-tag, I'd personally be in favor of leaving the 0.3.7 tag in place since who knows how many people have fetched it by now, and just go straight to 0.3.8.

Mar 24 '15 09:03 tkelman

Yes, it can certainly wait a week or two for 0.3.8.

Mar 24 '15 10:03 ViralBShah

I should have fixed this on both 0.3 and 0.4.

Mar 24 '15 14:03 Keno

So this original example still leaks memory, just much more slowly than before. Looking into the cause.

Feb 12 '16 19:02 StefanKarpinski

Cc @tanmaykm, since you are also a heavy user of this package...

Feb 15 '16 05:02 ViralBShah

This seems to be a bug upstream

Feb 15 '16 09:02 nkottary

I can see the leak even when just opening and closing sockets.

test code: https://gist.github.com/tanmaykm/8352059108c6b34f5ecf

Feb 15 '16 10:02 tanmaykm

I see that leaks are present even after closing the context after calling doopenclose in the above script. Calling zmq_unbind for each bind prevents these. I've added it here.

This does not fix the bug.

Feb 15 '16 13:02 nkottary

The original script to reproduce this leaks for a different reason (the calls to readall), but these scripts also show a memory leak:

Julia Version 0.4.0
Commit 0ff703b* (2015-10-08 06:20 UTC)
Platform Info:
  System: Linux (x86_64-redhat-linux)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT NO_AFFINITY SANDYBRIDGE)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Feb 26 '16 17:02 StefanKarpinski

Similar leakage on OS X:

Julia Version 0.4.4-pre+26
Commit 386d77b (2016-01-29 21:53 UTC)
Platform Info:
  System: Darwin (x86_64-apple-darwin14.5.0)
  CPU: Intel(R) Core(TM) M-5Y71 CPU @ 1.20GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

Feb 26 '16 19:02 StefanKarpinski

The only operation in the loop in this script is ZMQ.send(socket, "abcd"), so there's a leak in the code that creates the ZMQ message object and sends it. It seems highly unlikely that ZMQ's own send code has a memory leak, so I'm guessing this is about how we are creating message objects.

Feb 26 '16 19:02 StefanKarpinski

I suspect it's due to the finalizer.

Feb 26 '16 19:02 yuyichao

Ah, good thought, @yuyichao!

Feb 26 '16 19:02 StefanKarpinski

I'm trying to rebase and fix https://github.com/JuliaLang/julia/pull/13995 now ...

Feb 26 '16 20:02 yuyichao

Hmm, it seems that you are plotting the virtual address space size? It's not the most useful measure, since you are mostly measuring the 8G gc memory pool. This also kind of means that the leak is not in the GC pool objects.
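As an illustration of measuring something other than VSZ, here is a minimal sketch that reports resident memory instead. It assumes a recent Julia (1.x), where `Sys.maxrss()` returns the peak resident set size in bytes; the helper names `rss_mib` and `report_rss` are made up for this example. Because it is a peak value it never decreases, but for a steadily growing leak it should track the growth.

```julia
# Peak resident set size of the current process, in MiB.
# Sys.maxrss() reports bytes.
rss_mib() = Sys.maxrss() / 2^20

# Hypothetical helper mirroring the vsz(...) reporter from the original script.
function report_rss(label::AbstractString)
    println(label, round(rss_mib(); digits=1), " MiB")
end

report_rss("Initial peak RSS = ")

# Allocate and write to ~50 MiB so the pages actually become resident.
buf = fill!(Vector{UInt8}(undef, 50 * 2^20), 0x01)

report_rss("After allocating, peak RSS = ")
```

Calling `report_rss` inside the send loop in place of `vsz` would show whether resident memory, and not only reserved address space, keeps growing.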

Feb 27 '16 09:02 yuyichao

That's a fair point and I'm happy to measure something else, but this does reflect the impact of the program from the system's perspective – and it keeps using more and more resources while doing a very trivial loop.

Feb 29 '16 20:02 StefanKarpinski

I agree. I just mean that the reason for the leak is a little strange, since it's apparently not https://github.com/JuliaLang/julia/pull/13993 and isn't really fixed by https://github.com/JuliaLang/julia/pull/13995

Feb 29 '16 20:02 yuyichao

The leak may be in libuv, just as https://github.com/JuliaLang/julia/issues/13529 is probably due to a libuv issue.

Mar 01 '16 04:03 amitmurthy

Has anybody run massif on this?

Mar 01 '16 04:03 Keno

Hey all. So looking into things, people don't advise finalizers. They advise using https://docs.julialang.org/en/latest/manual/functions/#Do-Block-Syntax-for-Function-Arguments-1. This is recommended by Tim Holy: https://github.com/JuliaLang/julia/issues/11207#issuecomment-100469273

I'm thinking that we should not rely on finalizers. The issue is that lifetimes aren't strictly managed by scopes in Julia; objects are garbage-collected. Because of that, a finalizer runs whenever the gc gets around to it, which can be long after the scope closes. That's just how resource management is done in Julia. So while memory can be left to Julia's gc, sockets, contexts, and messages shouldn't be, because they could hang around until the gc deigns to release them, and that results in resource leaks, specifically open threads and memory. This requires a bit of a redesign, obviously.
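To make the do-block idea concrete, here is a minimal sketch of scope-bound cleanup. It does not use ZMQ.jl at all: `FakeSocket`, `open_socket`, `close_socket`, and `with_socket` are hypothetical stand-ins for a native resource and its open/close calls, so this shows the pattern only, not the package's API.

```julia
# Hypothetical stand-in for a handle to a native resource (e.g. a socket).
mutable struct FakeSocket
    isopen::Bool
end

open_socket() = FakeSocket(true)
close_socket(s::FakeSocket) = (s.isopen = false; nothing)

# Run `f` with a fresh socket and close it deterministically,
# even if `f` throws, instead of waiting for a finalizer.
function with_socket(f::Function)
    s = open_socket()
    try
        return f(s)
    finally
        close_socket(s)
    end
end

# Keep a reference so we can inspect the socket after the block ends.
escaped = Ref{Any}(nothing)

was_open_inside = with_socket() do s
    escaped[] = s
    s.isopen          # the socket is open while the block runs
end

println("open inside block: ", was_open_inside)
println("open after block:  ", escaped[].isopen)
```

The resource is released when the block exits, independent of when (or whether) the gc runs.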

I can do it (actually I already did it in my own clone of ZMQ.jl). Would people be interested in this? The nice bit is that it really only involves removing a lot of code from ZMQ.jl, simplifying interfaces. It also makes the Julia bindings more in line with both ZMQ and Julia paradigms.

I am thinking we can remove the Julia bindings that relate to ZMQ contexts entirely. AFAICT there really is only one use case for having more than one ZMQ context: ZMQ being imported in multiple places. That can be accomplished by a global variable holding the context handle. Julia takes care of each import having its own global, and then we can hide contexts from the user nearly altogether. (If users really want to control contexts, we can make some package-level functions for that.)
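The global-context idea could look roughly like the sketch below. Everything here is hypothetical and independent of ZMQ.jl: `FakeContext`, `FakeSocket2`, `DEFAULT_CONTEXT`, and `default_context` are illustrative names, and the integer `handle` stands in for a native context pointer.

```julia
# Hypothetical stand-in for a native context handle.
mutable struct FakeContext
    handle::Int
end

# One shared slot, empty until first use.
const DEFAULT_CONTEXT = Ref{Union{FakeContext,Nothing}}(nothing)

# Lazily create the shared context and hand back the same object afterwards.
function default_context()
    ctx = DEFAULT_CONTEXT[]
    if ctx === nothing
        ctx = FakeContext(1)
        DEFAULT_CONTEXT[] = ctx
    end
    return ctx
end

# Sockets pick up the shared context unless the caller passes one explicitly.
struct FakeSocket2
    ctx::FakeContext
end
FakeSocket2() = FakeSocket2(default_context())

a = FakeSocket2()
b = FakeSocket2()
println("sockets share one context: ", a.ctx === b.ctx)
```

Users who never mention a context get the shared one, while the one-argument constructor remains available for anyone who wants explicit control.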

Thoughts?

Dec 07 '18 21:12 joelfrederico
