
isDelimiter NPE

Open graphiclife opened this issue 9 years ago • 5 comments

I keep getting crashes in Pipe.java. It seems that inpipe.checkRead() returns true, but inpipe.probe() then returns null.

Fatal Exception: java.lang.NullPointerException
       at zmq.Pipe.isDelimiter(Pipe.java:472)
       at zmq.Pipe.checkRead(Pipe.java:183)
       at zmq.SessionBase.readActivated(SessionBase.java:298)
       at zmq.XSub$XSubSession.readActivated(XSub.java:24)
       at zmq.Pipe.processActivateRead(Pipe.java:290)
       at zmq.ZObject.processCommand(ZObject.java:57)
       at zmq.IOThread.inEvent(IOThread.java:93)
       at zmq.Poller.run(Poller.java:247)
       at java.lang.Thread.run(Thread.java:838)

Code that triggers the crash:

https://gist.github.com/graphiclife/27c4c5fa68d8c3fe2ff7

graphiclife avatar Dec 29 '15 11:12 graphiclife

I was not able to reproduce the NPE with your example code. Did you use the latest commit on master? Does the NPE happen every time, or only intermittently?

markif avatar Jan 11 '16 10:01 markif

I don't know much about the codebase yet, but it looks like the cause might be inpipe.probe() returning null on this line.

We're not getting an assertion error in the probe function, which means we must be returning queue.front() and getting null there.

Looking at queue.front(), it's just getting a value from an array by index. And we're not getting an OutOfBoundsException, which leads me to believe there must have been a null Msg in the queue.

I have no idea why that would be the case, but thought I'd leave my notes here in case it helps someone who may know more.
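To make the reasoning above concrete, here is a minimal sketch of how a plain array read can hand back null without any bounds error. It is not the jeromq code: TinyMsg, TinyQueue, and both isDelimiter helpers are hypothetical stand-ins, and pushing null directly is only an assumption used to put a null slot at the front.

```java
public class NullFrontSketch {
    /** Hypothetical stand-in for a message; not jeromq's Msg class. */
    static final class TinyMsg {
        private final boolean delimiter;

        TinyMsg(boolean delimiter) {
            this.delimiter = delimiter;
        }

        boolean isDelimiter() {
            return delimiter;
        }
    }

    /** Hypothetical fixed-size queue whose front() is a plain array read. */
    static final class TinyQueue {
        private final TinyMsg[] slots = new TinyMsg[4];
        private int begin = 0;
        private int end = 0;

        void push(TinyMsg msg) {
            // No null check here, so a null reference is stored like any other value.
            slots[end % slots.length] = msg;
            end++;
        }

        TinyMsg front() {
            // The index is always in range, so this never throws;
            // it simply returns whatever the slot holds, including null.
            return slots[begin % slots.length];
        }

        boolean hasItems() {
            return begin != end;
        }
    }

    /** Dereferences the message directly, as in the failing stack frame. */
    static boolean isDelimiterUnchecked(TinyMsg msg) {
        return msg.isDelimiter();
    }

    /** Null-tolerant variant; it hides the symptom but does not explain the null. */
    static boolean isDelimiterChecked(TinyMsg msg) {
        return msg != null && msg.isDelimiter();
    }

    public static void main(String[] args) {
        TinyQueue queue = new TinyQueue();
        queue.push(null); // assumption: some code path stores a null slot

        System.out.println("hasItems = " + queue.hasItems());
        System.out.println("front is null = " + (queue.front() == null));
        System.out.println("checked = " + isDelimiterChecked(queue.front()));

        try {
            isDelimiterUnchecked(queue.front());
        } catch (NullPointerException e) {
            System.out.println("unchecked call threw NullPointerException");
        }
    }
}
```

In this sketch the queue reports that it has items and front() succeeds, yet the first dereference of the returned value fails, which matches the shape of the trace. It says nothing about how a null would end up in the real queue.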

daveyarwood avatar Sep 08 '16 02:09 daveyarwood

I took another brief look at the code, and it appears to be in about the same state. My assessment now is the same as it was a year ago.

Information needed:

  • An explanation by someone who understands the code.
  • What does it mean for a null Msg to be in the queue?

daveyarwood avatar Oct 05 '17 03:10 daveyarwood

Is there a known way to trigger this? I saw this exact NPE on a running server my company has, and cannot figure out how we got there. I have been trying to put garbage onto our pipe but cannot manually induce it. All the referenced code from the original investigation 404s on me.

If someone could explain how you get a null message into the queue I could at least try to see if some of the new messaging code is somehow triggering that, since I can't directly fix an NPE in jeromq. Thanks!

bahostetterlewis avatar Feb 23 '18 02:02 bahostetterlewis

It might be fixed with the following change:

07dcebd#diff-da2d6c91e26a787671da920e8bf2d452R103

The same problem was actually happening in NetMQ (which is a port of jeromq), and we fixed it there a few years back.

somdoron avatar May 09 '20 06:05 somdoron