`Cluster` command spike on topology refresh leading to higher latencies.
Current Behavior
I am seeing `CLUSTER` command spikes whenever a Redis node gets replaced. These spikes slow down all other client calls, leading to higher latencies. I have a service with 900 instances connecting to a Redis cluster of 300 nodes (150 primaries). I have tried multiple settings, such as enabling only adaptive refresh (with dynamic refresh sources and periodic refresh disabled) and other combinations, but none of them help. Looking through the Lettuce code, it seems Lettuce queries all 300 Redis nodes (primaries and replicas) and picks the topology based on which topology view knows about the largest number of existing nodes. I cannot tell whether the adaptive trigger timeout works.
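To make the selection rule described above concrete, here is a minimal sketch in plain Java. It is not Lettuce's actual code: the class `TopologyViewSketch`, the method `pickView`, and the node names are hypothetical, and it only illustrates "pick the view that knows the most already-known nodes" as I understand it.

```java
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Lettuce.
final class TopologyViewSketch {

    // Returns the view that contains the largest number of nodes the client
    // already knows about. On a tie the first such view wins. Returns null
    // when no views were collected.
    static Set<String> pickView(Set<String> knownNodes, List<Set<String>> views) {
        Set<String> best = null;
        int bestScore = -1;
        for (Set<String> view : views) {
            int score = 0;
            for (String node : view) {
                if (knownNodes.contains(node)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = view;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("node-a", "node-b", "node-c");
        // One view per queried node; node-d stands for a replacement node.
        Set<String> staleView = Set.of("node-a", "node-d");
        Set<String> freshView = Set.of("node-a", "node-b", "node-d");
        System.out.println(pickView(known, List.of(staleView, freshView)));
    }
}
```

The point is that a rule like this needs one view per queried node, so the number of queries grows with the number of refresh sources.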
Input Code
redisClient.setOptions(
    ClusterClientOptions.builder()
        .autoReconnect(true)
        .requestQueueSize(REQUEST_QUEUE_SIZE)
        .cancelCommandsOnReconnectFailure(true)
        .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
        .topologyRefreshOptions(
            ClusterTopologyRefreshOptions.builder()
                .enablePeriodicRefresh(false)
                .enableAllAdaptiveRefreshTriggers()
                .dynamicRefreshSources(false)
                .build())
        .timeoutOptions(
            TimeoutOptions.builder()
                .timeoutCommands(true)
                .fixedTimeout(Duration.ofMillis(COMMAND_TIMEOUT_MS)) // command timeout
                .build())
        .build());
Environment
- Lettuce version(s) : 5.3.7, 6.2.1, 6.2.2, 6.2.3, 6.2.4
- Redis version: 6.0.16
Additional context
@mp911de any suggestions?
I also encountered this, and the OS reported a flood attack. Redis version: 5.0.14, Lettuce version: 6.0.1.
With dynamicRefreshSources disabled, Lettuce uses only the seed nodes provided in RedisClusterClient.create(…) instead of reaching out to all cluster nodes. These spikes indicate that some event has caused increased topology refreshes.
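As a rough back-of-the-envelope illustration of why the refresh source setting matters at this scale, the sketch below multiplies the number of clients by the number of refresh sources each one queries. It assumes every client refreshes once around the same event and sends a single topology query per source, which is a simplification. The class `RefreshFanOut` and the seed node count of 3 are hypothetical; the 900 clients and 300 nodes come from the report above.

```java
// Hypothetical illustration only; not part of Lettuce.
final class RefreshFanOut {

    // Topology queries hitting the cluster if every client refreshes once
    // and sends one query to each of its refresh sources.
    static long queriesPerRefreshRound(int clients, int sourcesPerClient) {
        return (long) clients * sourcesPerClient;
    }

    public static void main(String[] args) {
        int clients = 900;      // service instances, from the report
        int clusterNodes = 300; // primaries plus replicas, from the report
        int seedNodes = 3;      // assumed value, not stated in the report

        // Every node used as a refresh source.
        System.out.println(queriesPerRefreshRound(clients, clusterNodes)); // 270000
        // Only the seed nodes used as refresh sources.
        System.out.println(queriesPerRefreshRound(clients, seedNodes));    // 2700
    }
}
```

Under these assumptions a single refresh round differs by two orders of magnitude between the two modes, so a spike at this size would suggest either that refreshes are triggered repeatedly or that more sources than the seed nodes are being queried.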
This view is pretty high-level. You would need to investigate what happened during a spike, ideally by capturing a debug log from one of the nodes.
If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 30 days this issue will be closed.