
`Cluster` command spike on topology refresh leading to higher latencies.

Open kushwaha0791 opened this issue 2 years ago • 3 comments

Current Behavior

I am seeing `CLUSTER` command spikes whenever a Redis node gets replaced. These spikes slow down all other client calls, leading to higher latencies. I have a service with 900 instances connecting to a Redis cluster of 300 nodes (150 primaries). I have tried multiple settings, such as enabling only adaptive refresh (dynamic refresh sources and periodic refresh disabled) and other combinations, but none of them help. Looking through the Lettuce code, it seems Lettuce calls into all 300 Redis nodes (primaries and replicas) and picks the topology from whichever node's view knows about the largest number of existing nodes. I cannot tell whether the adaptive trigger timeout works.
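To make the selection rule described above concrete, here is a minimal sketch in plain Java. It is not Lettuce's implementation: the class `TopologyViewPicker`, the method `pickLargestView`, and the node names are hypothetical, and the sketch only models "pick the view that knows about the most already-known nodes" as the reporter reads it from the code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class TopologyViewPicker {

    // Returns the source node whose topology view contains the most currently known nodes.
    // Ties keep the first view encountered (map iteration order). Returns null if there are no views.
    // ASSUMPTION: this models the heuristic described in the issue, not Lettuce's actual code.
    static String pickLargestView(Map<String, Set<String>> viewsBySource, Set<String> knownNodes) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Set<String>> entry : viewsBySource.entrySet()) {
            int count = 0;
            for (String node : entry.getValue()) {
                if (knownNodes.contains(node)) {
                    count++;
                }
            }
            if (count > bestCount) {
                bestCount = count;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("a", "b", "c"); // hypothetical node IDs
        Map<String, Set<String>> views = new LinkedHashMap<>();
        views.put("a", Set.of("a", "b"));
        views.put("b", Set.of("a", "b", "c"));
        views.put("c", Set.of("c"));
        System.out.println(pickLargestView(views, known)); // prints "b"
    }
}
```

Under this model, building the candidate views requires one topology query per source node, which is why the number of nodes each client contacts matters for the spike.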

Input Code
    redisClient.setOptions(
        ClusterClientOptions.builder()
            .autoReconnect(true)
            .requestQueueSize(REQUEST_QUEUE_SIZE)
            .cancelCommandsOnReconnectFailure(true)
            .disconnectedBehavior(ClientOptions.DisconnectedBehavior.REJECT_COMMANDS)
            .topologyRefreshOptions(
                ClusterTopologyRefreshOptions.builder()
                    .enablePeriodicRefresh(false)
                    .enableAllAdaptiveRefreshTriggers()
                    .dynamicRefreshSources(false)
                    .build())
            .timeoutOptions(
                TimeoutOptions.builder()
                    .timeoutCommands(true)
                    .fixedTimeout(Duration.ofMillis(COMMAND_TIMEOUT_MS)) // command timeout
                    .build())
            .build());

Environment

  • Lettuce version(s) : 5.3.7, 6.2.1, 6.2.2, 6.2.3, 6.2.4
  • Redis version: 6.0.16

Additional context

Screenshot 2023-07-26 at 5 12 54 PM

kushwaha0791 avatar Jul 27 '23 00:07 kushwaha0791

@mp911de any suggestions?

kushwaha0791 avatar Aug 01 '23 21:08 kushwaha0791

I also encountered this, and the OS reported a flood attack. Redis version: 5.0.14, Lettuce version: 6.0.1.

1209233066 avatar Aug 16 '23 04:08 1209233066

With dynamicRefreshSources disabled, Lettuce uses only the seed nodes provided in RedisClusterClient.create(…) instead of reaching out to all cluster nodes. These spikes indicate that some event has caused increased topology refreshes.
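As a rough back-of-the-envelope illustration of the difference between the two modes, the sketch below multiplies the fleet size by the number of nodes each client contacts per refresh. The 900 instances and 300 nodes come from the report; the seed-list size of 3 is an assumption, as are the class and method names. It counts node contacts only, not the individual commands sent on each contact.

```java
public class RefreshFanOut {

    // Total node contacts across the fleet if every client performs one topology refresh.
    static long contactsPerRefreshRound(long clientInstances, long nodesQueriedPerClient) {
        return clientInstances * nodesQueriedPerClient;
    }

    public static void main(String[] args) {
        long clients = 900;      // service instances, from the report
        long clusterNodes = 300; // primaries + replicas, from the report
        long seedNodes = 3;      // ASSUMPTION: hypothetical seed list size

        // Every client queries every cluster node
        System.out.println(contactsPerRefreshRound(clients, clusterNodes)); // 270000
        // Every client queries only the seed nodes
        System.out.println(contactsPerRefreshRound(clients, seedNodes));    // 2700
    }
}
```

Note that in this model each contacted node still receives one contact per client per round, so limiting the sources reduces the total volume but not the load on the seed nodes themselves.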

This view is pretty high-level. You'd need to investigate what happened during a spike, ideally by capturing a debug log from one of the nodes.

mp911de avatar Aug 16 '23 07:08 mp911de

If you would like us to look at this issue, please provide the requested information. If the information is not provided within the next 30 days this issue will be closed.

github-actions[bot] avatar Feb 19 '25 00:02 github-actions[bot]