lettuce icon indicating copy to clipboard operation
lettuce copied to clipboard

CommandHandler.exceptionCaught does incorrect behavior when occuring exception in addToStack()

Open hyunwookwi opened this issue 6 years ago • 6 comments

lettuce version: 4.4.2.Final

when exception occured before stack.add(commandToUse) in addToStack(), Netty call CommandHandler.exceptionCaught(). CommandHandler.exceptioCaught() process stack.poll();

I think this behavior seems to be wrong. because The command that failed due to an exception is not yet on the stack. Please check your code.

Currently, I am using lettuce to get/set cache data with redis. My application send set command using lettuce, and I expected "OK" response. But sometimes, my app received response of previously get command. And previously sent get command occur TimeoutException.

hyunwookwi avatar Jun 05 '18 05:06 hyunwookwi

Good catch. Do you have an example of an exception that may happen in the code path before the command is added to the protocol stack?

mp911de avatar Jun 05 '18 06:06 mp911de

If my application doesn't have enough memory, JRE throw OutOfMemory while processing stack.add(commandToUse);

Thank you.

hyunwookwi avatar Jun 05 '18 06:06 hyunwookwi

Lettuce can't account for VM errors. If an Error is thrown, then the whole driver gets into an undefined state.

mp911de avatar Jun 05 '18 07:06 mp911de

However, I think lettuce should handle it when all kinds of exceptions are occured. Otherwise, any subsequent Commands cannot receive a expected response anymore.

Or you can disconnect channel and rebuild it.

hyunwookwi avatar Jun 05 '18 07:06 hyunwookwi

To quote a recommendation:

It may work, but it is generally a bad idea. There is no guarantee that your application will succeed in recovering, or that it will know if it has not succeeded. For example:

There really may be not enough memory to do the requested tasks, even after taking recovery steps like releasing block of reserved memory. In this situation, your application may get stuck in a loop where it repeatedly appears to recover and then runs out of memory again.

The OOME may be thrown on any thread. If an application thread or library is not designed to cope with it, this might leave some long-lived data structure in an incomplete or inconsistent state.

If threads die as a result of the OOME, the application may need to restart them as part of the OOME recovery. At the very least, this makes the application more complicated.

Suppose that a thread synchronizes with other threads using notify/wait or some higher level mechanism. If that thread dies from an OOME, other threads may be left waiting for notifies (etc) that never come ... for example. Designing for this could make the application significantly more complicated.

However, let's see whether we can do anything here at all.

mp911de avatar Jun 05 '18 07:06 mp911de

I hope my application receive an error or exception at all rather than an wrong response value.

have a nice day.

hyunwookwi avatar Jun 05 '18 07:06 hyunwookwi

I hope my application receive an error or exception at all rather than an wrong response value.

have a nice day.

i agree. Incorrect data is very dangerous to business programs.

734839030 avatar Jan 03 '23 06:01 734839030

I strongly agree with @hyunwookwi and @734839030 .

@mp911de In fact, I have seen a few cases of mixed incorrect response data due to OOME. This is a very dangerous situation in business. It could lead to the leakage of personal information.

In the case of OOME due to heap exhaustion, it can be avoided by specifying ExitOnOutOfMemoryError. However, in the case of OOME due to direct buffer exhaustion, ExitOnOutOfMemoryError does not work, and the application will continue to operate with incorrect response data returned.

Of course, it is difficult for a library to do the right thing for OOME, but can we stop returning wrong responses? It would be preferable for the application to crush rather than return incorrect response data.

be-hase avatar Dec 08 '23 10:12 be-hase

The exception handling and connection handling have been revised quite a few times since the original report. Do you have stack traces handy?

mp911de avatar Dec 08 '23 14:12 mp911de

Sorry, unfortunately I do not have the stack traces at hand. I am trying to reproduce it, but I can't seem to write a small reproduction code.

Maybe it is the same as this one. https://github.com/lettuce-io/lettuce-core/issues/2528

For example, I think both heap and direct buffer could be OOME during encode. https://github.com/lettuce-io/lettuce-core/blob/c8d9011d5299dd6816c52cc404945fbfcf1157ac/src/main/java/io/lettuce/core/codec/StringCodec.java#L134-L148 https://github.com/netty/netty/blob/4.1/buffer/src/main/java/io/netty/buffer/Unpooled.java#L185-L211

be-hase avatar Dec 08 '23 16:12 be-hase

Hmmm. But maybe it's different because when it's OOME, it doesn't go into this block. https://github.com/lettuce-io/lettuce-core/blob/main/src/main/java/io/lettuce/core/protocol/CommandEncoder.java#L95-L97

be-hase avatar Dec 08 '23 17:12 be-hase