lettuce icon indicating copy to clipboard operation
lettuce copied to clipboard

ClusterCommand error completeExceptionally cause OOM

Open Wuuqy opened this issue 1 year ago • 4 comments

Bug Report

Class ClusterCommand completeExceptionally function cannot notice onComplete function to run, maybe cause many timeout task and finally OOM.

Current Behavior

Stack trace
// your stack trace here;

Input Code

Input Code
// ClusterCommand.java
public boolean completeExceptionally(Throwable ex) {
        // why not super.completeExceptionally(ex)? 
        // now it may cause ClusterCommand.onComplete function not invoke
        // cause CommandExpiryWriter.timer hold many elements and finally OOM if timeout is big
        boolean result = command.completeExceptionally(ex);
        completed = true;
        return result;
}

// CommandExpiryWriter.java
private void potentiallyExpire(RedisCommand<?, ?, ?> command, ScheduledExecutorService executors) {
        long timeout = applyConnectionTimeout ? this.timeout : source.getTimeout(command);
        if (timeout <= 0) {
            return;
        }
        Timeout commandTimeout = timer.newTimeout(t -> {
            if (!command.isDone()) {
                executors.submit(() -> command.completeExceptionally(
                        ExceptionFactory.createTimeoutException(Duration.ofNanos(timeUnit.toNanos(timeout)))));

            }
        }, timeout, timeUnit);

        if (command instanceof CompleteableCommand) {
            // currently can not RUN when ClusterCommand invoke completeExceptionally, so timeout task cannot cancel
            ((CompleteableCommand<?>) command).onComplete((o, o2) -> commandTimeout.cancel());
        }

    }

Expected behavior/code

Environment

  • Lettuce version(s): [e.g. Main]
  • Redis version: [e.g. ALL]

Possible Solution

// ClusterCommand.java
  public boolean completeExceptionally(Throwable ex) {
      boolean result = super.completeExceptionally(ex);
      completed = true;
      return result;
  }

Additional context

Wuuqy avatar Aug 19 '24 09:08 Wuuqy

Hey @Wuuqy ,

I am having trouble understanding the issue you are describing. Could you provide a way for me to reproduce it?

            // currently can not RUN when ClusterCommand invoke completeExceptionally, so timeout task cannot cancel
            ((CompleteableCommand<?>) command).onComplete((o, o2) -> commandTimeout.cancel());

This got me really confused. TimeoutTask does should not call this, there are two options here:

  1. The timer counts down and if the command is not yet done it runs .completeExceptionally() or
  2. The command is executed (and onComplete() is called) which cancels the timer

I do - however - feel we might want to call the super-method, because it seems to me we are not guarded against #1576

tishun avatar Aug 26 '24 15:08 tishun

I do - however - feel we might want to call the super-method, because it seems to me we are not guarded against #1576

@mp911de, what do you think?

tishun avatar Aug 26 '24 15:08 tishun

command.completeExceptionally should be replaced with a call to super.completeExceptionally(…). Probably an oversight when migrating to CommandWrapper as already the previous code used command.completeExceptionally and not super.…

mp911de avatar Aug 28 '24 07:08 mp911de

command.completeExceptionally should be replaced with a call to super.completeExceptionally(…). Probably an oversight when migrating to CommandWrapper as already the previous code used command.completeExceptionally and not super.…

Thanks for the confirmation @mp911de and thanks for the report @Wuuqy !

tishun avatar Aug 28 '24 08:08 tishun

Hi @tishun @mp911de I created a pr for this, please help review 🙇

thachlp avatar Sep 09 '24 02:09 thachlp