
Intermittent RedisCommandTimeoutException in Lettuce Client

Open priyavaddineni opened this issue 1 month ago • 4 comments

Hello,

We're experiencing intermittent RedisCommandTimeoutException spikes in our production environment during Lettuce read operations (sync mget commands) against an AWS ElastiCache replication group, with a 400 ms command timeout (observed latency is generally around 5-10 ms), SSL, and IAM auth enabled.

All metrics look normal: no TPS spikes, the connection pool is healthy, thread utilization is stable, IAM auth connections are refreshing properly (every 12 hours), and ElastiCache GetTypeCmdsLatency shows normal latency. The timeouts occur sporadically with no discernible pattern, and we are not seeing them in our pre-prod environments. We don't have Lettuce debug logs enabled because they are too noisy.

Seeking guidance on potential causes and optimal Lettuce configuration to resolve these intermittent timeouts.

Caused by: io.lettuce.core.RedisCommandTimeoutException: Command timed out after 400 millisecond(s)
    at io.lettuce.core.internal.ExceptionFactory.createTimeoutException(ExceptionFactory.java:63)
    at io.lettuce.core.internal.Futures.awaitOrCancel(Futures.java:233)
    at io.lettuce.core.FutureSyncInvocationHandler.handleInvocation(FutureSyncInvocationHandler.java:79)
    at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:84)
    at jdk.proxy2/jdk.proxy2.$Proxy158.mget(Unknown Source)
    at jdk.internal.reflect.GeneratedMethodAccessor134.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at io.lettuce.core.support.ConnectionWrapping$DelegateCloseToConnectionInvocationHandler.handleInvocation(ConnectionWrapping.java:200)
    at io.lettuce.core.internal.AbstractInvocationHandler.invoke(AbstractInvocationHandler.java:84)
    at jdk.proxy2/jdk.proxy2.$Proxy158.mget(Unknown Source)
// Connection Pool Config
GenericObjectPoolConfig<StatefulRedisConnection<String, String>> poolConfig =
        new GenericObjectPoolConfig<>();

poolConfig.setMaxWait(Duration.ofMillis(500));
poolConfig.setBlockWhenExhausted(true);
poolConfig.setMaxTotal(128);
poolConfig.setMaxIdle(32);
poolConfig.setMinIdle(16);
poolConfig.setTestOnBorrow(true);
poolConfig.setTestWhileIdle(true);
poolConfig.setMinEvictableIdleDuration(Duration.ofSeconds(60));
poolConfig.setTimeBetweenEvictionRuns(Duration.ofSeconds(30));

// Redis URI Config
RedisURI.builder()
  .withHost(endpoint)
  .withPort(port)
  .withTimeout(Duration.ofMillis(500))
  .withSsl(true)
  .withVerifyPeer(false)
  .withAuthentication(iamAuthCredentialsProvider)
  .build();

// Client Options
ClientOptions.builder()
  .autoReconnect(true)
  .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
  .build();

// Client Resources
DefaultClientResources.builder()
  .ioThreadPoolSize(16)
  .computationThreadPoolSize(16)
  .build();
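
For completeness, the pool is wired through Lettuce's ConnectionPoolSupport along these lines (simplified sketch; client, poolConfig, and keys stand in for our actual wiring, and exception handling is omitted):

import java.util.List;
import org.apache.commons.pool2.impl.GenericObjectPool;
import io.lettuce.core.KeyValue;
import io.lettuce.core.api.StatefulRedisConnection;
import io.lettuce.core.support.ConnectionPoolSupport;

// Wrapped connections are returned to the pool on close(), which is why the stack
// trace above goes through ConnectionWrapping$DelegateCloseToConnectionInvocationHandler.
GenericObjectPool<StatefulRedisConnection<String, String>> pool =
        ConnectionPoolSupport.createGenericObjectPool(() -> client.connect(), poolConfig);

try (StatefulRedisConnection<String, String> connection = pool.borrowObject()) {
    // Synchronous MGET; this is the call that intermittently times out after 400 ms.
    List<KeyValue<String, String>> values = connection.sync().mget(keys);
}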

priyavaddineni · Oct 30 '25 17:10

Hey @priyavaddineni, we are going through your issue. In the meantime, could you please provide your Redis and Lettuce versions?

a-TODO-rov · Oct 31 '25 08:10

Thank you for looking into it!

Redis engine version: 7.0.7
Lettuce version: 6.4

priyavaddineni · Oct 31 '25 21:10

Hey @priyavaddineni ,

unfortunately, timeout exceptions are notoriously hard to track down.

One thing I could not find in your analysis is how many resources (CPU/threads) are allocated to the driver and whether it is using all of them (have you checked for CPU spikes on the client side)?

Sometimes, when the driver is overloaded and has no free resources to process the incoming replies from the server, it slows down and the commands that are waiting to be processed time out.
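
If client-side saturation turns out to be the issue, a lower-noise alternative to debug logs is Lettuce's built-in command latency metrics, which are published on the client's EventBus. A minimal sketch (assuming HdrHistogram/LatencyUtils are on the classpath, which the default latency collector relies on; redisUri refers to the RedisURI you showed above, and the emit interval and logging are placeholders):

import java.time.Duration;
import io.lettuce.core.RedisClient;
import io.lettuce.core.event.DefaultEventPublisherOptions;
import io.lettuce.core.event.metrics.CommandLatencyEvent;
import io.lettuce.core.resource.ClientResources;
import io.lettuce.core.resource.DefaultClientResources;

// Publish aggregated per-command latency metrics once a minute.
ClientResources resources = DefaultClientResources.builder()
        .commandLatencyPublisherOptions(DefaultEventPublisherOptions.builder()
                .eventEmitInterval(Duration.ofMinutes(1))
                .build())
        .build();

RedisClient client = RedisClient.create(resources, redisUri);

// firstResponse covers the network/server round trip; completion additionally includes
// client-side processing, so a gap between the two points at the client.
resources.eventBus().get()
        .filter(CommandLatencyEvent.class::isInstance)
        .cast(CommandLatencyEvent.class)
        .subscribe(event -> System.out.println(event.getLatencies()));

If the firstResponse percentiles stay around your usual 5-10 ms while the completion percentiles spike toward the timeout, the extra time is being spent inside the client rather than on ElastiCache.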

tishun · Nov 01 '25 07:11

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 2 weeks this issue will be closed.

github-actions[bot] · Dec 02 '25 00:12