redis-operator
rfr-redisfailover Readiness probe failed
Expected behaviour
The rfr-redisfailover pods are running and ready.
Actual behaviour
The rfr-redisfailover pods are running but never become ready (0/1).
kubectl get all
NAME                                     READY   STATUS    RESTARTS   AGE
pod/redisoperator-78c9d88948-555wz       1/1     Running   0          95m
pod/rfr-redisfailover-0                  0/1     Running   0          61s
pod/rfr-redisfailover-1                  0/1     Running   0          61s
pod/rfr-redisfailover-2                  0/1     Running   0          61s
pod/rfs-redisfailover-6b8648d584-r6qtn   1/1     Running   0          61s
pod/rfs-redisfailover-6b8648d584-stjb5   1/1     Running   0          61s
pod/rfs-redisfailover-6b8648d584-zln7d   1/1     Running   0          61s

NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)     AGE
service/kubernetes          ClusterIP   10.96.0.1        <none>        443/TCP     110m
service/rfs-redisfailover   ClusterIP   10.107.189.183   <none>        26379/TCP   61s

NAME                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/redisoperator       1/1     1            1           95m
deployment.apps/rfs-redisfailover   3/3     3            3           61s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/redisoperator-78c9d88948       1         1         1       95m
replicaset.apps/rfs-redisfailover-6b8648d584   3         3         3       61s

NAME                                 READY   AGE
statefulset.apps/rfr-redisfailover   0/3     61s
Steps to reproduce the behaviour
kubectl apply -f basic.yaml
apiVersion: databases.spotahome.com/v1
kind: RedisFailover
metadata:
  name: redisfailover
spec:
  sentinel:
    replicas: 3
    resources:
      requests:
        cpu: 100m
      limits:
        memory: 100Mi
  redis:
    replicas: 3
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
      limits:
        cpu: 400m
        memory: 500Mi
  auth:
    secretPath: redis-auth
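Note: spec.auth.secretPath references a Secret named redis-auth that must already exist in the same namespace; per the operator's docs it is expected to carry the password under a key named password. A minimal sketch of creating it (the literal value is just a placeholder):

kubectl create secret generic redis-auth --from-literal=password=a-strong-password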
Environment
How are the pieces configured?
- Redis Operator version: 1.2.0
- Kubernetes version: 1.22
- Kubernetes configuration used (e.g. is RBAC active?)
Logs
kubectl describe pod/rfr-redisfailover-0
...
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  2m3s               default-scheduler  Successfully assigned default/rfr-redisfailover-0 to node01
  Normal   Pulling    2m1s               kubelet            Pulling image "redis:6.2.6-alpine"
  Normal   Pulled     115s               kubelet            Successfully pulled image "redis:6.2.6-alpine" in 5.807746669s
  Normal   Created    115s               kubelet            Created container redis
  Normal   Started    115s               kubelet            Started container redis
  Warning  Unhealthy  2s (x10 over 83s)  kubelet            Readiness probe failed:
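To see exactly what the failing probe runs, you can dump its definition from the pod spec, plus the readiness script the operator mounts (the rfr-readiness-redisfailover ConfigMap that appears in the operator logs below); a diagnostic sketch:

kubectl get pod rfr-redisfailover-0 -o jsonpath='{.spec.containers[0].readinessProbe}'
kubectl get configmap rfr-readiness-redisfailover -o yaml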
kubectl logs pod/rfr-redisfailover-0
1:C 15 Sep 2022 17:10:05.475 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 15 Sep 2022 17:10:05.475 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 15 Sep 2022 17:10:05.475 # Configuration loaded
1:S 15 Sep 2022 17:10:05.475 * monotonic clock: POSIX clock_gettime
1:S 15 Sep 2022 17:10:05.476 * Running mode=standalone, port=6379.
1:S 15 Sep 2022 17:10:05.476 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 15 Sep 2022 17:10:05.476 # Server initialized
1:S 15 Sep 2022 17:10:05.476 * Ready to accept connections
1:S 15 Sep 2022 17:10:05.477 * Connecting to MASTER 127.0.0.1:6379
1:S 15 Sep 2022 17:10:05.477 * MASTER <-> REPLICA sync started
1:S 15 Sep 2022 17:10:05.477 * Non blocking connect for SYNC fired the event.
1:S 15 Sep 2022 17:10:05.477 * Master replied to PING, replication can continue...
1:S 15 Sep 2022 17:10:05.478 * Partial resynchronization not possible (no cached master)
1:S 15 Sep 2022 17:10:05.478 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
1:S 15 Sep 2022 17:10:06.480 * Connecting to MASTER 127.0.0.1:6379
1:S 15 Sep 2022 17:10:06.480 * MASTER <-> REPLICA sync started
1:S 15 Sep 2022 17:10:06.480 * Non blocking connect for SYNC fired the event.
1:S 15 Sep 2022 17:10:06.480 * Master replied to PING, replication can continue...
1:S 15 Sep 2022 17:10:06.480 * Partial resynchronization not possible (no cached master)
1:S 15 Sep 2022 17:10:06.480 * Master is currently unable to PSYNC but should be in the future: -NOMASTERLINK Can't SYNC while not connected with my master
...
kubectl logs rfs-redisfailover-6b8648d584-r6qtn
1:X 15 Sep 2022 17:10:08.499 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:X 15 Sep 2022 17:10:08.499 # Redis version=6.2.6, bits=64, commit=00000000, modified=0, pid=1, just started
1:X 15 Sep 2022 17:10:08.499 # Configuration loaded
1:X 15 Sep 2022 17:10:08.499 * monotonic clock: POSIX clock_gettime
1:X 15 Sep 2022 17:10:08.500 * Running mode=sentinel, port=26379.
1:X 15 Sep 2022 17:10:08.500 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:X 15 Sep 2022 17:10:08.510 # Sentinel ID is f5036ba023ce94ab259588a3d84e8997af7d835e
1:X 15 Sep 2022 17:10:08.510 # +monitor master mymaster 127.0.0.1 6379 quorum 2
1:X 15 Sep 2022 17:10:09.512 # +sdown master mymaster 127.0.0.1 6379
It seems all the rfr-redisfailover pods are slaves now.
The issue looks similar to #412, but I still can't find a solution there.
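One way to confirm that each replica is replicating from itself (127.0.0.1) and what Sentinel currently believes the master is — a diagnostic sketch, assuming the redis-auth password is exported as $REDISPASS:

kubectl exec rfr-redisfailover-0 -c redis -- redis-cli -a "$REDISPASS" info replication
kubectl exec rfs-redisfailover-6b8648d584-r6qtn -- redis-cli -p 26379 sentinel get-master-addr-by-name mymaster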
Could you paste any relevant logs from the operator pod (redisoperator-78c9d88948-555wz)?
time="2022-09-15T15:35:53Z" level=info msg="Listening on :9710 for metrics exposure" src="asm_amd64.s:1581"
time="2022-09-15T15:35:53Z" level=info msg="starting controller" controller-id=redisfailover operator=redisfailover service=kooper.controller src="controller.go:233"
time="2022-09-15T15:36:26Z" level=info msg="service created" namespace=default service=k8s.service serviceName=rfs-redisfailover src="service.go:61"
time="2022-09-15T15:36:27Z" level=info msg="configMap created" configMap=rfs-redisfailover namespace=default service=k8s.configMap src="configmap.go:68"
time="2022-09-15T15:36:27Z" level=info msg="configMap created" configMap=rfr-s-redisfailover namespace=default service=k8s.configMap src="configmap.go:68"
time="2022-09-15T15:36:27Z" level=info msg="configMap created" configMap=rfr-readiness-redisfailover namespace=default service=k8s.configMap src="configmap.go:68"
time="2022-09-15T15:36:27Z" level=info msg="configMap created" configMap=rfr-redisfailover namespace=default service=k8s.configMap src="configmap.go:68"
W0915 15:36:27.146989 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:36:27.151321 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:36:27Z" level=info msg="podDisruptionBudget created" namespace=default podDisruptionBudget=rfr-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:69"
time="2022-09-15T15:36:27Z" level=info msg="statefulSet created" namespace=default service=k8s.statefulSet src="statefulset.go:92" statefulSet=rfr-redisfailover
W0915 15:36:27.344713 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:36:27.347547 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:36:27Z" level=info msg="podDisruptionBudget created" namespace=default podDisruptionBudget=rfs-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:69"
time="2022-09-15T15:36:27Z" level=info msg="deployment created" deployment=rfs-redisfailover namespace=default service=k8s.deployment src="deployment.go:92"
time="2022-09-15T15:36:53Z" level=info msg="configMap updated" configMap=rfs-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:36:53Z" level=info msg="configMap updated" configMap=rfr-s-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:36:53Z" level=info msg="configMap updated" configMap=rfr-readiness-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:36:53Z" level=info msg="configMap updated" configMap=rfr-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
W0915 15:36:53.606013 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:36:53.610646 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:36:53Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfr-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:79"
time="2022-09-15T15:36:53Z" level=info msg="statefulSet updated" namespace=default service=k8s.statefulSet src="statefulset.go:102" statefulSet=rfr-redisfailover
W0915 15:36:53.618265 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:36:53.620573 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:36:53Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfs-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:79"
time="2022-09-15T15:36:53Z" level=info msg="deployment updated" deployment=rfs-redisfailover namespace=default service=k8s.deployment src="deployment.go:102"
time="2022-09-15T15:36:58Z" level=error msg="error on object processing: dial tcp 192.168.140.66:6379: i/o timeout" controller-id=redisfailover object-key=default/redisfailover operator=redisfailover service=kooper.controller src="controller.go:279"
time="2022-09-15T15:37:23Z" level=info msg="configMap updated" configMap=rfs-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:37:23Z" level=info msg="configMap updated" configMap=rfr-s-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:37:23Z" level=info msg="configMap updated" configMap=rfr-readiness-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
time="2022-09-15T15:37:23Z" level=info msg="configMap updated" configMap=rfr-redisfailover namespace=default service=k8s.configMap src="configmap.go:78"
W0915 15:37:23.656240 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:37:23.658631 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:37:23Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfr-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:79"
time="2022-09-15T15:37:23Z" level=info msg="statefulSet updated" namespace=default service=k8s.statefulSet src="statefulset.go:102" statefulSet=rfr-redisfailover
W0915 15:37:23.665736 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
W0915 15:37:23.744248 1 warnings.go:70] policy/v1beta1 PodDisruptionBudget is deprecated in v1.21+, unavailable in v1.25+; use policy/v1 PodDisruptionBudget
time="2022-09-15T15:37:23Z" level=info msg="podDisruptionBudget updated" namespace=default podDisruptionBudget=rfs-redisfailover service=k8s.podDisruptionBudget src="poddisruptionbudget.go:79"
time="2022-09-15T15:37:23Z" level=info msg="deployment updated" deployment=rfs-redisfailover namespace=default service=k8s.deployment src="deployment.go:102"
time="2022-09-15T15:37:28Z" level=error msg="error on object processing: dial tcp 192.168.140.66:6379: i/o timeout" controller-id=redisfailover object-key=default/redisfailover operator=redisfailover service=kooper.controller src="controller.go:279
@ese
It seems redis-operator cannot connect to the Redis instances to configure them:
time="2022-09-15T15:37:28Z" level=error msg="error on object processing: dial tcp 192.168.140.66:6379: i/o timeout" controller-id=redisfailover object-key=default/redisfailover operator=redisfailover service=kooper.controller src="controller.go:279"
Redis instances bootstrap as slaves of themselves until redis-operator takes control and configures the cluster. The readiness probe will keep failing until the operator has configured them.
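This matches the pod logs above ("Connecting to MASTER 127.0.0.1:6379"): the generated Redis config points each instance at itself until the operator promotes one of them. If in doubt, you can check the generated ConfigMap for the replication directive, e.g.:

kubectl get configmap rfr-redisfailover -o yaml | grep -iE 'slaveof|replicaof'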
What kind of Kubernetes deployment are you using (GKE, kind, kops, ...)? Which CNI are you using? Do you have any network policies in place?
I also ran into this issue because of a network policy. Maybe the status could be reported back onto the RedisFailover resource for better transparency. The timeout is currently configured at 30s; decreasing it would make the failure easier to spot. A 10s timeout would already be good enough.
time="2022-10-11T10:47:29Z" level=error msg="error on object processing: dial tcp 10.42.0.66:6379: i/o timeout" controller-id=redisfailover object-key=somenamespace/redis operator=redisfailover service=kooper.controller src="controller.go:279"
I have the same problem. The rfr service doesn't get deployed by the operator and thus isn't found, which produces the errors in the logs.
This issue is stale because it has been open for 45 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.
bump
Same issue.