
ERR no server connection avaliable

Open zkl94 opened this issue 5 years ago • 7 comments

Hello, I have a Redis cluster deployed on k8s as a StatefulSet (version 5.0.9-alpine), with predixy as the proxy. predixy runs fine at first, but after a while the logs fill with errors and it even rejects connections outright with ERR no server connection avaliable; the error only goes away after restarting the pod, and then reappears some time later. I searched earlier issues and found that others have hit a similar problem, but it does not seem to have been resolved. How can this be fixed?

Here is an excerpt from the logs:

2020-09-28 18:19:43.269529 E Handler.cpp:437 h 1 s cnx-bionix-facial-monitor-redis-4.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 19 will be close with status 4 EventError
2020-09-28 18:19:43.269557 E Handler.cpp:437 h 1 s 10.187.144.105:6379 42 will be close with status 4 EventError
2020-09-28 18:19:43.269558 E Handler.cpp:437 h 1 s 10.187.144.46:6379 45 will be close with status 4 EventError
2020-09-28 18:19:43.269559 E Handler.cpp:437 h 1 s 10.187.144.26:6379 78 will be close with status 4 EventError
2020-09-28 18:19:43.269617 N Handler.cpp:264 server cnx-bionix-facial-monitor-redis-4.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 mark failure
2020-09-28 18:19:43.269644 N Handler.cpp:275 h 1 close s cnx-bionix-facial-monitor-redis-4.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 19 and c None -1 with status 4 EventError
2020-09-28 18:19:43.269659 N Handler.cpp:264 server 10.187.144.105:6379 mark failure
2020-09-28 18:19:43.269670 N Handler.cpp:275 h 1 close s 10.187.144.105:6379 42 and c None -1 with status 4 EventError
2020-09-28 18:19:43.269676 N Handler.cpp:264 server 10.187.144.46:6379 mark failure
2020-09-28 18:19:43.269677 N Handler.cpp:275 h 1 close s 10.187.144.46:6379 45 and c None -1 with status 4 EventError
2020-09-28 18:19:43.269682 N Handler.cpp:264 server 10.187.144.26:6379 mark failure
2020-09-28 18:19:43.269683 N Handler.cpp:275 h 1 close s 10.187.144.26:6379 78 and c None -1 with status 4 EventError
2020-09-28 18:19:43.269805 E Handler.cpp:437 h 1 s cnx-bionix-facial-monitor-redis-1.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 16 will be close with status 4 EventError
2020-09-28 18:19:43.269821 N Handler.cpp:264 server cnx-bionix-facial-monitor-redis-1.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 mark failure
2020-09-28 18:19:43.269823 N Handler.cpp:275 h 1 close s cnx-bionix-facial-monitor-redis-1.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379 16 and c None -1 with status 4 EventError
2020-09-28 18:19:43.269925 E Handler.cpp:437 h 0 s 10.187.145.109:6379 36 will be close with status 4 EventError
2020-09-28 18:19:43.269940 N Handler.cpp:264 server 10.187.145.109:6379 mark failure
2020-09-28 18:19:43.269941 N Handler.cpp:275 h 0 close s 10.187.145.109:6379 36 and c None -1 with status 4 EventError
2020-09-28 18:19:43.270104 E Handler.cpp:437 h 1 s 10.187.145.170:6379 40 will be close with status 4 EventError
2020-09-28 18:19:43.270126 N Handler.cpp:264 server 10.187.145.170:6379 mark failure
2020-09-28 18:19:43.270135 N Handler.cpp:275 h 1 close s 10.187.145.170:6379 40 and c None -1 with status 4 EventError
2020-09-28 18:19:43.270254 N Logger.cpp:152 MissLog count 3

predixy.conf:

predixy.conf: |
    Name redisproxy
    Bind 0.0.0.0:6379
    WorkerThreads 2
    MaxMemory 20G
    ClientTimeout 300
    BufSize 65536
    LogDebugSample 0
    Authority {
        Auth xxxxxxxxx {
            Mode admin
        }
    }
    ClusterServerPool {
        Password xxxxxxxxx
        MasterReadPriority 60
        StaticSlaveReadPriority 50
        DynamicSlaveReadPriority 50
        RefreshInterval 1
        ServerTimeout 30
        ServerFailureLimit 10
        ServerRetryTimeout 1
        KeepAlive 120
        Servers {
          + cnx-bionix-facial-monitor-redis-0.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-1.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-2.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-3.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-4.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-5.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-6.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-7.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-8.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
          + cnx-bionix-facial-monitor-redis-9.cnx-bionix-facial-monitor-redis-headless.cnx-bionix-facial-monitor.svc.cluster.local:6379
        }
    }
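For reference, here is my reading of how the failure-handling knobs in the config above interact (the values are the ones from this config; the comments are my interpretation of the predixy options, not authoritative documentation):

```
ClusterServerPool {
    RefreshInterval 1       # re-pull the cluster topology every 1 second
    ServerTimeout 30        # a request pending on a backend for >30s is treated as failed
    ServerFailureLimit 10   # after 10 failures the server is marked failed ("mark failure" in the logs)
    ServerRetryTimeout 1    # a failed server is probed again after 1 second
    KeepAlive 120           # TCP keepalive on backend connections
}
```

If every backend is simultaneously in the marked-failed state (for example after a burst of EventError closes like the log above), the proxy has no healthy connection to hand out, which is when clients see ERR no server connection avaliable.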

zkl94 commented Sep 28, 2020

Doesn't the issue you linked say this is caused by pod deletion changing the pod IPs?

caojiajun commented Sep 29, 2020

Doesn't the issue you linked say this is caused by pod deletion changing the pod IPs?

I understand that, but pod restarts are routine on k8s, and I am already using the pods' DNS names rather than raw IPs. Shouldn't predixy be able to handle failed pods automatically, marking them as failed and recovering on its own? If so, why do I still get no server connection avaliable? Thanks.
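One thing worth noting: DNS names in `Servers` only affect the seed list. When a cluster-aware proxy refreshes the topology, Redis's `CLUSTER NODES` output advertises raw IP addresses, never the DNS names used to seed it, so after a pod restart the refreshed view can still contain stale pod IPs. A minimal sketch of why (the sample lines below are made up; the field layout follows the documented `CLUSTER NODES` format):

```python
# CLUSTER NODES lines look like:
#   <id> <ip:port@cport> <flags> <master-id> <ping> <pong> <epoch> <state> <slots...>
# The address field is always an IP, regardless of how the client was seeded.

def node_addresses(cluster_nodes_output: str) -> list:
    """Extract the ip:port each node advertises in CLUSTER NODES output."""
    addrs = []
    for line in cluster_nodes_output.strip().splitlines():
        addr_field = line.split()[1]            # e.g. "10.187.144.105:6379@16379"
        addrs.append(addr_field.split("@")[0])  # drop the cluster-bus port
    return addrs

sample = (
    "07c37dfe 10.187.144.105:6379@16379 master - 0 1601288383000 1 connected 0-5460\n"
    "e7d1eecc 10.187.144.46:6379@16379 master - 0 1601288383000 2 connected 5461-10922\n"
)
print(node_addresses(sample))  # → ['10.187.144.105:6379', '10.187.144.46:6379']
```

So even with DNS seeds, whether the proxy recovers after a pod restart depends on whether it keeps re-resolving and re-pulling the topology once backends start failing.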

zkl94 commented Sep 29, 2020

That depends on predixy's reconnection mechanism; let's wait for the author to answer.

If you're interested, you could also try another Redis cluster proxy: https://github.com/netease-im/camellia/blob/master/docs/redis-proxy/redis-proxy.md. When it cannot reach a backend, that proxy tries to trigger a renew of the node list.
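The "renew on failure" behaviour mentioned above can be sketched as follows. This is a hypothetical client-side pattern, not camellia's or predixy's actual code: in addition to refreshing the topology on a timer, also trigger a refresh whenever a backend connection fails, so a pod restart with a new IP is picked up immediately.

```python
import time

class ClusterView:
    """Hypothetical topology holder: refresh on a timer AND on connection failure."""

    def __init__(self, fetch_topology, refresh_interval=1.0):
        self._fetch = fetch_topology        # callable returning the current node list
        self._interval = refresh_interval
        self._nodes = fetch_topology()
        self._last_refresh = time.monotonic()

    def nodes(self):
        # Periodic refresh, analogous to predixy's RefreshInterval.
        if time.monotonic() - self._last_refresh >= self._interval:
            self.refresh()
        return self._nodes

    def refresh(self):
        self._nodes = self._fetch()
        self._last_refresh = time.monotonic()

    def on_connect_error(self, node):
        # Event-driven renew: a dead backend often means the topology changed
        # (e.g. a k8s pod came back with a new IP), so re-pull it immediately
        # instead of waiting for the next timer tick.
        self.refresh()

# Usage sketch with a fake topology source whose address changes:
topology = ["10.0.0.1:6379"]
view = ClusterView(lambda: list(topology), refresh_interval=60.0)
topology[:] = ["10.0.0.2:6379"]          # pod restarted with a new IP
view.on_connect_error("10.0.0.1:6379")   # connection to the old IP fails -> renew
print(view.nodes())  # → ['10.0.0.2:6379']
```

The key design point is that recovery no longer depends on the periodic refresh racing against the failure counter; the failure itself drives the refresh.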

caojiajun commented Sep 29, 2020

Same problem here. Desperately hoping for a solution.

YuhuaDeng commented Sep 29, 2020

Hit the same problem, deployed in a k8s cluster. Is there any solution?

StevenLeiZhang commented Mar 25, 2021

Same problem here. Desperately hoping for a solution.

Did you try the approach suggested below? Does that user's fix work?

wyl9527 commented Jun 20, 2022

Is there a fix for this?

1191681612 commented Feb 28, 2024