redis-operator
PVCs remain after deleting RedisCluster object
What version of redis operator are you using?
redis-operator version: 0.15.1
Does this issue reproduce with the latest release? Yes, using the current release 0.15.1.
What operating system and processor architecture are you using (kubectl version)?
Output:
$ oc version
Client Version: 4.12.26
Kustomize Version: v4.5.7
Server Version: 4.13.27
Kubernetes Version: v1.26.11+8cfd402
What did you do?
- Deploy operator on OpenShift cluster in the openshift-operators namespace
- Create a NetworkPolicy resource to allow network traffic from the Operator to the deployed Redis pods (a sketch of such a policy is included after these steps)
- Create RedisCluster Custom Resource:
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  clusterVersion: v7
  persistenceEnabled: true
  kubernetesConfig:
    image: quay.io/opstree/redis:v7.0.12
    imagePullPolicy: IfNotPresent
  redisExporter:
    enabled: true
    image: quay.io/opstree/redis-exporter:v1.45.0
    imagePullPolicy: IfNotPresent
  storage:
    keepAfterDelete: false
    nodeConfVolume: true
    nodeConfVolumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
- (Wait for deployment to stabilize. Verify successful deployment.)
- Delete Custom Resource
$ oc delete RedisCluster redis-cluster
rediscluster.redis.redis.opstreelabs.in "redis-cluster" deleted
- Wait to stabilize. Verify deletion of resource and pods
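For reference, the NetworkPolicy from the second step was not included here. A minimal sketch of such a policy, assuming the Redis pods carry the redis_setup_type: cluster label that is visible on the PVCs below and that the operator runs in openshift-operators (the policy name is made up; the manifest actually used was not shared):
$ oc apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-operator-to-redis   # hypothetical name; the real policy was not shared
spec:
  podSelector:
    matchLabels:
      redis_setup_type: cluster   # assumed pod label, matching the labels shown on the PVCs
  policyTypes:
  - Ingress
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: openshift-operators
    ports:
    - protocol: TCP
      port: 6379
EOF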
What did you expect to see?
All related PVCs are deleted along with the Redis Cluster pods
What did you see instead?
- All related PVCs remain:
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
node-conf-redis-cluster-follower-0 Bound pvc-4e3d8843-23c7-409d-88d6-3548af3cf9c3 1Gi RWO ocs-storagecluster-ceph-rbd 41m
node-conf-redis-cluster-follower-1 Bound pvc-227e21d7-6390-4a44-9eb3-2d3ae4ed3846 1Gi RWO ocs-storagecluster-ceph-rbd 41m
node-conf-redis-cluster-follower-2 Bound pvc-15f13c4c-953a-4796-a33e-5fc3fe3e558c 1Gi RWO ocs-storagecluster-ceph-rbd 41m
node-conf-redis-cluster-leader-0 Bound pvc-92056c1b-dbb5-4514-9bfc-ea2d2170472e 1Gi RWO ocs-storagecluster-ceph-rbd 41m
node-conf-redis-cluster-leader-1 Bound pvc-88aebee3-69d3-4cc0-b58a-6eb892d180ff 1Gi RWO ocs-storagecluster-ceph-rbd 41m
node-conf-redis-cluster-leader-2 Bound pvc-944a37d6-cee8-412a-80bb-90ab089ea7e5 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-follower-redis-cluster-follower-0 Bound pvc-c824f5b3-1c2a-473d-9354-e2883a5b4229 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-follower-redis-cluster-follower-1 Bound pvc-9668f8ff-b293-44f3-b979-bcfe0b677252 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-follower-redis-cluster-follower-2 Bound pvc-58645cdd-20ad-450e-ae1a-7461ba7b1fe5 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-leader-redis-cluster-leader-0 Bound pvc-19832b28-143e-417b-8b70-f61bffa4055a 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-leader-redis-cluster-leader-1 Bound pvc-6f888a9f-1034-4097-8bfa-7599767eb799 1Gi RWO ocs-storagecluster-ceph-rbd 41m
redis-cluster-leader-redis-cluster-leader-2 Bound pvc-4d792ac7-c3f7-49d5-b930-ad9f9caf79e9 1Gi RWO ocs-storagecluster-ceph-rbd 41m
- Error in Operator log: (log from the moment the deletion was initiated)
{"level":"info","ts":1711636630.8305004,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"lbv-develop","Request.Name":"redis-cluster"}
{"level":"error","ts":1711636630.9304478,"logger":"controller.rediscluster","msg":"Reconciler error","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"RedisCluster","name":"redis-cluster","namespace":"lbv-develop","error":"Operation cannot be fulfilled on redisclusters.redis.redis.opstreelabs.in \"redis-cluster\": StorageError: invalid object, Code: 4, Key: /kubernetes.io/redis.redis.opstreelabs.in/redisclusters/lbv-develop/redis-cluster, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 64d37913-b6ff-43fd-90d0-4c792c0d7111, UID in object meta: ","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
{"level":"info","ts":1711636630.9305146,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"lbv-develop","Request.Name":"redis-cluster"}
{"level":"info","ts":1711636630.935724,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"lbv-develop","Request.Name":"redis-cluster"}
Thank you for your feedback, @solipsistadventurer!
Although we have not seen this happen before, we run end-to-end tests to guarantee that the PVCs are deleted after the custom resource is removed.
Could you share the PVC YAML with us after deleting the Redis cluster? For example, you can use kubectl get pvc redis-cluster-leader-redis-cluster-leader-0 -oyaml.
Hi @drivebyer ,
Here is the yaml you requested:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    redis.opstreelabs.in: "true"
    redis.opstreelabs.instance: redis-cluster
    volume.beta.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
    volume.kubernetes.io/storage-provisioner: openshift-storage.rbd.csi.ceph.com
  creationTimestamp: "2024-03-29T13:27:48Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: redis-cluster-leader
    redis_setup_type: cluster
    role: leader
  name: redis-cluster-leader-redis-cluster-leader-0
  namespace: cet-lbv-lucas-develop
  resourceVersion: "116051541"
  uid: da289a0e-5b79-415c-a338-546880e08f36
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ocs-storagecluster-ceph-rbd
  volumeMode: Filesystem
  volumeName: pvc-da289a0e-5b79-415c-a338-546880e08f36
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound
I use this manifest:
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  clusterVersion: v7
  persistenceEnabled: true
  kubernetesConfig:
    image: quay.io/opstree/redis:v7.0.12
    imagePullPolicy: IfNotPresent
  redisExporter:
    enabled: true
    image: quay.io/opstree/redis-exporter:v1.45.0
    imagePullPolicy: IfNotPresent
  storage:
    keepAfterDelete: false
    nodeConfVolume: true
    nodeConfVolumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    volumeClaimTemplate:
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
@solipsistadventurer I am using this YAML with the master branch, and I noticed that the PVC was deleted after deleting the custom resource. My suggestion is to check whether a finalizer named redisClusterFinalizer is present on the rediscluster before you delete it.
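For example, you could verify the finalizer with:
$ kubectl get rediscluster redis-cluster -o jsonpath="{.metadata.finalizers}"
["redisClusterFinalizer"]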
@drivebyer Thanks for your response.
There appears to be a redisClusterFinalizer present in the RedisCluster custom resource.
However, something that I did not mention before is that the resource appears to be stuck in the 'Bootstrap' state.
The actual Redis cluster appears to work normally, and I do not see anything that would indicate that something out of the ordinary is happening.
There is one thing in the Operator log that I do not understand: it repeatedly logs the message "Redis leader count is desired". Do you know what this means and how I could resolve it? I wonder if it could be related to this PVC issue.
"Stabilized" RedisCluster resource:
oc get RedisCluster redis-cluster -o yaml
apiVersion: v1
items:
- apiVersion: redis.redis.opstreelabs.in/v1beta2
  kind: RedisCluster
  metadata:
    creationTimestamp: "2024-03-29T20:23:01Z"
    finalizers:
    - redisClusterFinalizer
    generation: 2
    name: redis-cluster
    namespace: cet-lbv-lucas-develop
    resourceVersion: "116825602"
    uid: 1ed262c3-4a00-4061-89fd-f9f0c44a1cd8
  spec:
    clusterSize: 3
    clusterVersion: v7
    kubernetesConfig:
      image: cir-cn.chp.belastingdienst.nl/external/quay.io/opstree/redis:v7.0.12
      imagePullPolicy: IfNotPresent
      updateStrategy: {}
    persistenceEnabled: true
    redisExporter:
      enabled: true
      image: cir-cn.chp.belastingdienst.nl/external/quay.io/opstree/redis-exporter:v1.45.0
      imagePullPolicy: IfNotPresent
    redisFollower:
      livenessProbe:
        failureThreshold: 3
        initialDelaySeconds: 1
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      readinessProbe:
        failureThreshold: 3
        initialDelaySeconds: 1
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
    redisLeader:
      livenessProbe:
        failureThreshold: 3
        initialDelaySeconds: 1
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
      readinessProbe:
        failureThreshold: 3
        initialDelaySeconds: 1
        periodSeconds: 10
        successThreshold: 1
        timeoutSeconds: 1
    storage:
      nodeConfVolume: true
      nodeConfVolumeClaimTemplate:
        metadata: {}
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
        status: {}
      volumeClaimTemplate:
        metadata: {}
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
        status: {}
      volumeMount: {}
  status:
    readyFollowerReplicas: 3
    readyLeaderReplicas: 3
    reason: RedisCluster is bootstrapping
    state: Bootstrap
kind: List
metadata:
  resourceVersion: ""
Operator log (section repeats)
{"level":"info","ts":1711744702.0825884,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"cet-lbv-lucas-develop","Request.Name":"redis-cluster"}
{"level":"info","ts":1711744702.1418211,"logger":"controller_redis","msg":"Redis PodDisruptionBudget get action failed","Request.PodDisruptionBudget.Namespace":"cet-lbv-lucas-develop","Request.PodDisruptionBudget.Name":"redis-cluster-leader"}
{"level":"info","ts":1711744702.3332732,"logger":"controller_redis","msg":"Redis PodDisruptionBudget get action failed","Request.PodDisruptionBudget.Namespace":"cet-lbv-lucas-develop","Request.PodDisruptionBudget.Name":"redis-cluster-follower"}
{"level":"info","ts":1711744702.337287,"logger":"controllers.RedisCluster","msg":"Creating redis cluster by executing cluster creation commands","Request.Namespace":"cet-lbv-lucas-develop","Request.Name":"redis-cluster","Leaders.Ready":"3","Followers.Ready":"3"}
{"level":"info","ts":1711744702.3477137,"logger":"controllers.RedisCluster","msg":"Redis leader count is desired","Request.Namespace":"cet-lbv-lucas-develop","Request.Name":"redis-cluster"}
@solipsistadventurer, the Bootstrap state suggests that the Redis cluster might not be ready yet. You can check this by running redis-cli --cluster check 127.0.0.1:6379 to see whether the cluster has been created properly.
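If you are not already inside a Redis pod, the same check can be run with kubectl exec against one of the leader pods, for example:
$ kubectl exec -it redis-cluster-leader-0 -- redis-cli --cluster check 127.0.0.1:6379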
@drivebyer, the cluster seems fine. So far I have not found any indication of issues in the actual Redis cluster.
Nonetheless, after many hours the RedisCluster state is still Bootstrap.
I will be happy to troubleshoot the issue further. For example, I would love to know why the bootstrap hangs, what causes the error in the Operator when deleting the resource, and whether these issues are actually related to the PVCs not getting deleted.
However, I would appreciate some pointers on where to look.
$ redis-cli --cluster check 127.0.0.1:6379
127.0.0.1:6379 (9866b447...) -> 0 keys | 5461 slots | 1 slaves.
10.129.5.84:6379 (d3a27680...) -> 1 keys | 5461 slots | 1 slaves.
10.131.3.125:6379 (22224200...) -> 0 keys | 5462 slots | 1 slaves.
[OK] 1 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: 9866b44726e31a5269e97636e9aae937048b5478 127.0.0.1:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: f4d51617299f8560647b7a72a0e5e31f812c5691 10.128.3.246:6379
slots: (0 slots) slave
replicates 2222420031cf4519527aa3c01f9712b86a7ef1c3
M: d3a276808b77fd14ca1b74e42d47c2b4d4056d97 10.129.5.84:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 2222420031cf4519527aa3c01f9712b86a7ef1c3 10.131.3.125:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: 7cf070e3616b301432a2d2ae454aa2f9c9682ccf 10.129.5.85:6379
slots: (0 slots) slave
replicates d3a276808b77fd14ca1b74e42d47c2b4d4056d97
S: fe66201e5e29a97ad931499ee6c82b5261e46592 10.130.6.18:6379
slots: (0 slots) slave
replicates 9866b44726e31a5269e97636e9aae937048b5478
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
Hi, I have the same PVC issue.
K8s version: v1.25.16
redis-operator version: v0.15.1
rediscluster object:
---
apiVersion: redis.redis.opstreelabs.in/v1beta2
kind: RedisCluster
metadata:
  name: redis-cluster
spec:
  clusterSize: 3
  clusterVersion: v7
  persistenceEnabled: true
  podSecurityContext:
    runAsUser: 1000
    fsGroup: 1000
  kubernetesConfig:
    image: registry.xyz.zone/external/quay.io-opstree-redis:v7.2.3
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 101m
        memory: 128Mi
      limits:
        cpu: 101m
        memory: 128Mi
  redisLeader:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis-cluster-leader
            topologyKey: kubernetes.io/hostname
    pdb:
      enabled: true
      maxUnavailable: 1
  redisFollower:
    affinity:
      podAntiAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - redis-cluster-follower
            topologyKey: kubernetes.io/hostname
    pdb:
      enabled: true
      maxUnavailable: 1
  redisExporter:
    enabled: true
    image: registry.xyz.zone/external/quay.io-opstree-redis-exporter:v1.44.0
    imagePullPolicy: IfNotPresent
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 100m
        memory: 128Mi
  storage:
    volumeClaimTemplate:
      spec:
        storageClassName: ceph-rbd-sc
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 1Gi
pods state:
k get pod
NAME READY STATUS RESTARTS AGE
redis-cluster-follower-0 2/2 Running 0 103s
redis-cluster-follower-1 2/2 Running 0 90s
redis-cluster-follower-2 2/2 Running 0 83s
redis-cluster-leader-0 2/2 Running 0 2m24s
redis-cluster-leader-1 2/2 Running 0 2m18s
redis-cluster-leader-2 2/2 Running 0 2m13s
cluster state:
kubectl exec -it redis-cluster-leader-0 -- redis-cli --cluster check 127.0.0.1:6379
Defaulted container "redis-cluster-leader" out of: redis-cluster-leader, redis-exporter
127.0.0.1:6379 (e608995f...) -> 0 keys | 5461 slots | 1 slaves.
10.0.6.20:6379 (ef6e5170...) -> 0 keys | 5461 slots | 1 slaves.
10.0.1.116:6379 (1a2b80b5...) -> 0 keys | 5462 slots | 1 slaves.
[OK] 0 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 127.0.0.1:6379)
M: e608995f6a64d719522fad805729f363bf052612 127.0.0.1:6379
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 5d97d9d0be9cdf5c3c5a54357d99e5e3ac1acfe1 10.0.1.247:6379
slots: (0 slots) slave
replicates ef6e5170a179d82bd82515569e6454021c564c45
M: ef6e5170a179d82bd82515569e6454021c564c45 10.0.6.20:6379
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 1a2b80b5e51dc2f49f758569594a9912434acc9c 10.0.1.116:6379
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: a297d118045f111c69be46d840515237bb3ac925 10.0.1.55:6379
slots: (0 slots) slave
replicates e608995f6a64d719522fad805729f363bf052612
S: ec31b2c29bb2482f74f6f507b6830c3f2d3cb503 10.0.6.21:6379
slots: (0 slots) slave
replicates 1a2b80b5e51dc2f49f758569594a9912434acc9c
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
finalizer:
k get rediscluster redis-cluster -o jsonpath="{.metadata.finalizers}"
["redisClusterFinalizer"]
PVCs still there after deletion of rediscluster object:
# k delete rediscluster redis-cluster
rediscluster.redis.redis.opstreelabs.in "redis-cluster" deleted
# k get pod
No resources found in redis-test namespace.
# k get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
redis-cluster-follower-redis-cluster-follower-0 Bound pvc-8ded2ad2-92ea-4c45-b01b-332a208163f9 1Gi RWO ceph-rbd-sc 5m35s
redis-cluster-follower-redis-cluster-follower-1 Bound pvc-46e22626-e1af-49d4-81a1-0ddedef2d288 1Gi RWO ceph-rbd-sc 5m22s
redis-cluster-follower-redis-cluster-follower-2 Bound pvc-9ef67035-c71a-4120-b58b-923f3afc50c4 1Gi RWO ceph-rbd-sc 5m15s
redis-cluster-leader-redis-cluster-leader-0 Bound pvc-7fedb0fc-ad53-40ec-859f-8695babbe344 1Gi RWO ceph-rbd-sc 7m7s
redis-cluster-leader-redis-cluster-leader-1 Bound pvc-a8b6a0ff-a5ac-4017-b022-59fdb61a4573 1Gi RWO ceph-rbd-sc 6m10s
redis-cluster-leader-redis-cluster-leader-2 Bound pvc-60a67886-acf0-490b-b38a-8c7a34c5200f 1Gi RWO ceph-rbd-sc 6m5s
PVC info:
# k get pvc redis-cluster-leader-redis-cluster-leader-0 -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    redis.opstreelabs.in: "true"
    redis.opstreelabs.instance: redis-cluster
    volume.beta.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
    volume.kubernetes.io/storage-provisioner: rbd.csi.ceph.com
  creationTimestamp: "2024-05-10T06:32:52Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: redis-cluster-leader
    redis_setup_type: cluster
    role: leader
  name: redis-cluster-leader-redis-cluster-leader-0
  namespace: redis-test
  resourceVersion: "9212791"
  uid: 7fedb0fc-ad53-40ec-859f-8695babbe344
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ceph-rbd-sc
  volumeMode: Filesystem
  volumeName: pvc-7fedb0fc-ad53-40ec-859f-8695babbe344
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 1Gi
  phase: Bound
operator log:
2024-05-10T06:39:21.086675871Z {"level":"info","ts":1715323161.0864742,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"redis-test","Request.Name":"redis-cluster"}
2024-05-10T06:39:21.153586093Z {"level":"info","ts":1715323161.1534524,"logger":"controllers.RedisCluster","msg":"Creating redis cluster by executing cluster creation commands","Request.Namespace":"redis-test","Request.Name":"redis-cluster","Leaders.Ready":"3","Followers.Ready":"3"}
2024-05-10T06:39:21.159934641Z {"level":"info","ts":1715323161.1598077,"logger":"controllers.RedisCluster","msg":"Redis leader count is desired","Request.Namespace":"redis-test","Request.Name":"redis-cluster"}
2024-05-10T06:39:49.284399463Z {"level":"info","ts":1715323189.2842412,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"redis-test","Request.Name":"redis-cluster"}
2024-05-10T06:39:49.321405415Z {"level":"error","ts":1715323189.3212578,"logger":"controller.rediscluster","msg":"Reconciler error","reconciler group":"redis.redis.opstreelabs.in","reconciler kind":"RedisCluster","name":"redis-cluster","namespace":"redis-test","error":"Operation cannot be fulfilled on redisclusters.redis.redis.opstreelabs.in \"redis-cluster\": StorageError: invalid object, Code: 4, Key: /registry/redis.redis.opstreelabs.in/redisclusters/redis-test/redis-cluster, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 63f33fb0-8958-44d0-a602-2ad2252cb2c2, UID in object meta: ","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227"}
2024-05-10T06:39:49.321444629Z {"level":"info","ts":1715323189.321347,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"redis-test","Request.Name":"redis-cluster"}
2024-05-10T06:39:49.326749558Z {"level":"info","ts":1715323189.326646,"logger":"controllers.RedisCluster","msg":"Reconciling opstree redis Cluster controller","Request.Namespace":"redis-test","Request.Name":"redis-cluster"}
fixed by https://github.com/OT-CONTAINER-KIT/redis-operator/pull/703
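Until a release containing that fix is available, the leftover claims can be removed manually. A sketch using the redis_setup_type label shown on the PVCs above (review the get output before deleting, and adjust the namespace to your own):
$ kubectl get pvc -n redis-test -l redis_setup_type=cluster
$ kubectl delete pvc -n redis-test -l redis_setup_type=cluster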