
Lettuce can't reconnect to another node in cluster

Open AndreyUS opened this issue 2 years ago • 0 comments

Bug Report

Current Behavior

We have a Redis Cluster with 3 nodes (1 master shard, 2 slaves) with proxy mode: all nodes.

When the client is connected to a proxy hosted on a slave node and that slave loses connectivity (blocked via iptables), the client keeps its connection to the dead node for 5 minutes. Only after receiving "Connection reset by peer" does the client reconnect, and it then connects to another node immediately.

If the proxy is on the master shard, everything works as expected in this scenario: the client switches its connection to a new node.

Stack trace
[channel=0x6e5bc2a6, /10.213.150.55:58866 -> host/ip:port, last known addr=host/node-ip-1:11211] channelInactive()[]

Reconnecting, last destination was host/node-ip-1:11211[]

[channel=0xda0be941, /10.213.159.75:59592 -> host/node-ip-2:11211, last known addr=host/node-ip-2:11211] channelActive()[]

Input Code

Here is the configuration for the Redis client:

    @Bean(destroyMethod = "shutdown")
    @Primary
    public ClientResources redisClientResources(RedisConfigurationProperties properties) {
        var eventEmitInterval = Duration.ofMillis(properties.getMetricsEmitFrequencyInMs());
        return ClientResources.builder()
                              .commandLatencyCollectorOptions(
                                      CommandLatencyCollectorOptions.builder()
                                                                    .targetUnit(TimeUnit.MILLISECONDS)
                                                                    .build())
                              .commandLatencyPublisherOptions(
                                      DefaultEventPublisherOptions.builder()
                                                                  .eventEmitInterval(eventEmitInterval)
                                                                  .build())
                              .build();
    }

    @Bean
    public LettuceClientConfigurationBuilderCustomizer connectTimeoutBuildCustomizer(RedisConfigurationProperties properties) {
        var socketOptions = SocketOptions.builder()
                                         .connectTimeout(Duration.ofMillis(properties.getConnectTimeoutInMs()))
                                         .keepAlive(properties.isKeepAlive())
                                         .keepAlive(SocketOptions.KeepAliveOptions.builder()
                                                                                  .enable()
                                                                                  .idle(Duration.ofSeconds(15))
                                                                                  .interval(Duration.ofSeconds(2))
                                                                                  .count(3)
                                                                                  .build())
                                         .tcpNoDelay(properties.isTcpNoDelay())
                                         .build();
        var adaptiveRefreshTimeout = Duration.ofMillis(properties.getAdaptiveRefreshTriggersTimeoutInMs());
        int refreshTriggersReconnectAttempts = properties.getRefreshTriggersReconnectAttempts();
        var clusterTopologyRefreshOptions = ClusterTopologyRefreshOptions.builder()
                                                                         .enablePeriodicRefresh()
                                                                         .enableAllAdaptiveRefreshTriggers()
                                                                         .adaptiveRefreshTriggersTimeout(adaptiveRefreshTimeout)
                                                                         .refreshTriggersReconnectAttempts(refreshTriggersReconnectAttempts)
                                                                         .build();

        var clientOptions = ClusterClientOptions.builder()
                                                .socketOptions(socketOptions)
                                                .topologyRefreshOptions(clusterTopologyRefreshOptions)
                                                .build();

        return clientConfigurationBuilder -> clientConfigurationBuilder.clientOptions(clientOptions);
    }
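
For reference, the keepalive values in the configuration above imply a much shorter detection window than the observed 5 minutes. The following is a minimal sketch of that arithmetic, assuming Linux-style TCP keepalive semantics (the first probe is sent after `idle`, then `count` unanswered probes spaced `interval` apart) and an otherwise idle connection. `KeepAliveEstimate` and `detectionTime` are hypothetical names used only for illustration; they are not part of Lettuce.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not a Lettuce API.
final class KeepAliveEstimate {

    // Approximate time until an idle, silently dropped connection is declared dead:
    // idle time before the first probe plus one interval per unanswered probe.
    static Duration detectionTime(Duration idle, Duration interval, int count) {
        return idle.plus(interval.multipliedBy(count));
    }

    public static void main(String[] args) {
        // Values taken from the KeepAliveOptions in the configuration above.
        Duration estimate = detectionTime(Duration.ofSeconds(15), Duration.ofSeconds(2), 3);
        System.out.println(estimate.getSeconds() + " s"); // prints "21 s"
    }
}
```

This is an estimate, not a measured value. It only holds if the extended keepalive options are actually applied to the socket, which may depend on the transport and Java version in use, and if no written data is waiting for an ACK on the connection, because TCP retransmission timeouts rather than keepalive probes govern that case. Neither condition has been verified for this setup.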

Expected behavior/code

I expect that when the slave node hosting the proxy loses connectivity, the client reconnects to a new node without the 5-minute delay.

Environment

  • Lettuce version(s): 6.1.8.RELEASE
  • Redis version: 6.2.5

AndreyUS, Apr 29 '22 13:04