Got a response when no response was expected!

Open • vstoyak opened this issue 10 years ago • 7 comments

I'm getting "Got a response when no response was expected!", which causes the client to fail. Any idea what could be wrong?

vstoyak, Apr 21 '14 15:04

We've seen that in a few scenarios, but it's tough for me to guess which one you're experiencing. Usually it happens when you give an incorrect offset in a fetch, meaning that you're probably not storing/passing the offsets correctly. I haven't seen this in a very long time though.

cainus, Apr 21 '14 17:04

+1 to @cainus' comment. One way to diagnose the issue is to look at the messages at and around the offset with the bundled Kafka command-line tools. Run the following from the bin dir:

 $ ./kafka-simple-consumer-shell.sh --server kafka://kafka-broker.com:9092 --topic mytopic --offset 8675309 --print-messages --print-offsets

Here the argument to --offset is the offset you're fetching in Prozess.

elee, Apr 21 '14 17:04

Thanks! Will do. One other question: why is this handled with a "throw" rather than by calling the callback with an error?

vstoyak, Apr 21 '14 22:04

Ohh... good question... Now that I look at the code, that bad-offset case is actually caught earlier. The exception in this case is one that should never be raised, so I left it as a throw. It shouldn't be possible to trigger this, but I often throw in the default block of a switch statement as a safeguard when there is no logical default. Sooo... now that you've accomplished what I anticipated would be impossible, I'd love to know if you have any steps to reproduce this. ;)

Also, just for sanity's sake: You're using Kafka 0.7 right?

cainus, Apr 21 '14 23:04

Unfortunately, it is hard to reproduce and I have yet to identify any pattern; occurrences can be a few days apart. Yes, we use 0.7.

vstoyak, Apr 21 '14 23:04

I just ran into this same issue. My application usually runs fine for a while (1 to 6 hours), then exits with this error. I have a theory about what is going on. I don't see any code in Consumer.js that prevents a user from calling consume and then calling it again before the first call has completed. If that happens, you would be waiting for two socket responses from Kafka. When the first one arrives, it sets requestMode to null, so when the second response is received you hit the default switch branch that generates this error.

I eliminated the problem in my code by no longer calling consume from an interval timer (my guess is that sometimes the response was delayed long enough that the interval fired again and I made another call to consume). In the Prozess code, maybe check requestMode when consume is called and return an error if it is not null? This would prevent a second request from being sent to Kafka until the first one has completed.
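
A minimal sketch of that guard, for illustration only. SketchConsumer, pending, and onResponse are hypothetical names and this is not the actual Consumer.js; only requestMode and consume come from the discussion above. It assumes a single in-flight request tracked by requestMode, with null meaning idle.

```javascript
// Hypothetical stand-in for a consumer; NOT the actual Prozess Consumer.js.
// Only `requestMode` and `consume` are names taken from the thread.
class SketchConsumer {
  constructor() {
    this.requestMode = null; // null means no request is in flight
    this.pending = null;     // callback for the in-flight request
  }

  // Start a fetch, refusing to start a second one while the first is outstanding.
  consume(callback) {
    if (this.requestMode !== null) {
      // Report the overlap through the callback instead of sending a second request.
      callback(new Error('consume() called while a previous request is still in flight'));
      return;
    }
    this.requestMode = 'fetch';
    this.pending = callback;
    // A real client would write the fetch request to the socket here.
  }

  // Simulates the socket delivering a response from the broker.
  onResponse(messages) {
    switch (this.requestMode) {
      case 'fetch': {
        const callback = this.pending;
        this.requestMode = null;
        this.pending = null;
        callback(null, messages);
        break;
      }
      default:
        // With the guard in consume(), this branch should not be reachable
        // through overlapping consume() calls.
        throw new Error('Got a response when no response was expected!');
    }
  }
}

// Usage: the second consume() is rejected, so only one response is ever awaited.
const consumer = new SketchConsumer();
const results = [];
const record = (err, messages) => results.push(err ? err.message : messages);
consumer.consume(record);
consumer.consume(record);   // rejected by the guard
consumer.onResponse(['m1']); // completes the first request
```

With a guard like this, a caller driving consume from an interval timer would see an ordinary error for the overlapping call and could simply skip that tick.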

kankele, Jun 19 '14 17:06

Hmmm I'm a bit torn on how to solve this, since I don't have time to actually test it properly. It would be easy enough to just return the error in a callback, but it would be something lame along the lines of:

  callback(new Error('you just found a rare race condition bug.  just refetch'));

This is obviously a stupid fix, even though it's an improvement over the current state.

Of course if you've got the time to test your proposed better solution, I'd certainly be happy to take a PR.
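
For illustration, the callback-based stopgap might look roughly like the sketch below. handleResponse and the state object are hypothetical and are not Prozess internals; the sketch also assumes a callback is still reachable when the unexpected response arrives, which may not hold in the real code.

```javascript
// Hypothetical response dispatcher; names are illustrative, not Prozess internals.
function handleResponse(state, data, callback) {
  switch (state.requestMode) {
    case 'fetch':
      state.requestMode = null; // the request is complete
      callback(null, data);
      break;
    default:
      // Instead of `throw`, which inside a socket event handler typically
      // surfaces as an uncaught exception, hand the caller an Error to act on.
      callback(new Error('Got a response when no response was expected!'));
  }
}

// Usage: the second response arrives with no request outstanding.
const state = { requestMode: 'fetch' };
const seen = [];
const record = (err, data) => seen.push(err ? 'error: ' + err.message : data);
handleResponse(state, 'payload-1', record);
handleResponse(state, 'payload-2', record);
```

This keeps the process alive and lets the caller refetch, but it does not address the underlying overlap of requests.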

cainus, Jun 21 '14 20:06