[BUG] backup restore cluster delete hang
Describe the bug Deleting a cluster that was restored from a backup hangs: the cluster stays in the Deleting status and its InstanceSet is never removed.
kbcli version
Kubernetes: v1.30.4-vke.10
KubeBlocks: 1.1.0-alpha.3
kbcli: 1.0.1
wait for the workloads to be deleted: map[{workloads.kubeblocks.io/v1, Kind=InstanceSet default/mysql-bk-mysql}:0xc003928708]
To Reproduce Steps to reproduce the behavior:
- create cluster
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: mysql-ccxdow
  namespace: default
spec:
  clusterDef: mysql
  topology: semisync
  terminationPolicy: WipeOut
  componentSpecs:
    - name: mysql
      serviceVersion: 8.0.30
      disableExporter: true
      replicas: 2
      resources:
        limits:
          cpu: 100m
          memory: 0.5Gi
        requests:
          cpu: 100m
          memory: 0.5Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
kubectl get cluster mysql-ccxdow -w
NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
mysql-ccxdow mysql WipeOut Running 4m20s
- backup
kbcli cluster backup mysql-ccxdow --method xtrabackup
Backup backup-default-mysql-ccxdow-20251104152833 created successfully, you can view the progress:
kbcli cluster list-backups --names=backup-default-mysql-ccxdow-20251104152833 -n default
- restore
kbcli cluster restore mysql-bk --backup backup-default-mysql-ccxdow-20251104152833
Cluster mysql-bk created
kubectl get cluster mysql-bk
NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
mysql-bk mysql WipeOut Running 4m55s
- delete restored cluster
kbcli cluster delete mysql-bk --auto-approve
Cluster mysql-bk deleted
- See error
kubectl get cluster mysql-bk
NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
mysql-bk mysql WipeOut Deleting 10m
➜ ~ kubectl get pod -l app.kubernetes.io/instance=mysql-bk
No resources found in default namespace.
➜ ~ kubectl get cmp -l app.kubernetes.io/instance=mysql-bk
NAME DEFINITION SERVICE-VERSION STATUS AGE
mysql-bk-mysql mysql-8.0-1.1.0-alpha.0 8.0.30 Deleting 11m
➜ ~ kubectl get its -l app.kubernetes.io/instance=mysql-bk
NAME DESIRED UP-TO-DATE READY AVAILABLE AGE
mysql-bk-mysql 2 2 2 10m
➜ ~ kubectl get svc -l app.kubernetes.io/instance=mysql-bk
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
mysql-bk-mysql ClusterIP 10.225.218.96 <none> 3306/TCP 11m
➜ ~ kubectl get cm -l app.kubernetes.io/instance=mysql-bk
NAME DATA AGE
mysql-bk-mysql-env 5 11m
mysql-bk-mysql-haconfig 0 9m1s
mysql-bk-mysql-hahistory 1 6m27s
mysql-bk-mysql-leader 0 9m1s
mysql-bk-mysql-mysql-scripts 6 11m
sidecar-mysql-bk-mysql-config-manager-config 1 11m
➜ ~ kubectl get secret -l app.kubernetes.io/instance=mysql-bk
NAME TYPE DATA AGE
mysql-bk-mysql-account-kbadmin Opaque 2 11m
mysql-bk-mysql-account-kbdataprotection Opaque 2 11m
mysql-bk-mysql-account-kbmonitoring Opaque 2 11m
mysql-bk-mysql-account-kbprobe Opaque 2 11m
mysql-bk-mysql-account-kbreplicator Opaque 2 11m
mysql-bk-mysql-account-proxysql Opaque 2 11m
mysql-bk-mysql-account-root Opaque 2 11m
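For anyone reproducing this, the hang shown in the steps above could be detected in a script with something like the sketch below. It relies on `kubectl wait --for=delete`; the function name `cluster_delete_hangs`, the `KUBECTL` override variable and the 300s default timeout are illustrative assumptions, not part of kbcli or KubeBlocks.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# cluster_delete_hangs NAME [NAMESPACE] [TIMEOUT]
# Hypothetical helper: returns 0 (true) if the Cluster object is still
# present after TIMEOUT, and 1 if it was deleted in time.
cluster_delete_hangs() {
  name="$1"
  namespace="${2:-default}"
  timeout="${3:-300s}"
  if "$KUBECTL" wait --for=delete "cluster.apps.kubeblocks.io/${name}" \
      -n "$namespace" --timeout="$timeout" >/dev/null 2>&1; then
    return 1   # the object was deleted within the timeout
  fi
  return 0     # still present, or the wait failed for another reason
}
```

Possible usage after `kbcli cluster delete mysql-bk --auto-approve`: `cluster_delete_hangs mysql-bk default 300s && echo "mysql-bk is still Deleting"`.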
describe cluster
kubectl describe cluster mysql-bk
Name: mysql-bk
Namespace: default
Labels: clusterdefinition.kubeblocks.io/name=mysql
Annotations: kubeblocks.io/crd-api-version: apps.kubeblocks.io/v1
API Version: apps.kubeblocks.io/v1
Kind: Cluster
Metadata:
Creation Timestamp: 2025-11-04T07:30:39Z
Deletion Grace Period Seconds: 0
Deletion Timestamp: 2025-11-04T07:35:41Z
Finalizers:
cluster.kubeblocks.io/finalizer
Generation: 2
Resource Version: 86704
UID: 740ad715-4abf-4a84-96dc-ed8bddedaeec
Spec:
Cluster Def: mysql
Component Specs:
Component Def: mysql-8.0-1.1.0-alpha.0
Disable Exporter: true
Flat Instance Ordinal: false
Name: mysql
Pod Update Policy: PreferInPlace
Replicas: 2
Resources:
Limits:
Cpu: 100m
Memory: 512Mi
Requests:
Cpu: 100m
Memory: 512Mi
Service Version: 8.0.30
Volume Claim Templates:
Name: data
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 20Gi
Termination Policy: WipeOut
Topology: semisync
Status:
Components:
Mysql:
Observed Generation: 1
Phase: Running
Up To Date: true
Conditions:
Last Transition Time: 2025-11-04T07:30:39Z
Message: The operator has started the provisioning of Cluster: mysql-bk
Observed Generation: 1
Reason: PreCheckSucceed
Status: True
Type: ProvisioningStarted
Last Transition Time: 2025-11-04T07:30:39Z
Message: Successfully applied for resources
Observed Generation: 1
Reason: ApplyResourcesSucceed
Status: True
Type: ApplyResources
Last Transition Time: 2025-11-04T07:35:30Z
Message: cluster mysql-bk is ready
Reason: ClusterReady
Status: True
Type: Ready
Observed Generation: 1
Phase: Deleting
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal PreCheckSucceed 11m cluster-controller The operator has started the provisioning of Cluster: mysql-bk
Normal ApplyResourcesSucceed 11m cluster-controller Successfully applied for resources
Normal ClusterComponentPhaseTransition 11m (x2 over 11m) cluster-controller cluster component mysql is Creating
Normal ClusterReady 7m5s cluster-controller cluster mysql-bk is ready
Normal Running 7m5s cluster-controller Cluster: mysql-bk is ready, current phase is Running
Normal ClusterComponentPhaseTransition 6m58s (x6 over 7m5s) cluster-controller cluster component mysql is Running
Normal DeletingCR 6m54s (x3 over 6m54s) cluster-controller Deleting : mysql-bk
logs kubeblocks
➜ ~ kubectl logs -n kb-system kubeblocks-85864d9c7-cql5g|grep "wait for the workloads"|grep mysql-bk-mysql
Defaulted container "manager" out of: manager, tools (init)
2025-11-04T07:35:41.702Z INFO wait for the workloads to be deleted: map[{workloads.kubeblocks.io/v1, Kind=InstanceSet default/mysql-bk-mysql}:0xc003928708] {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"mysql-bk-mysql","namespace":"default"}, "namespace": "default", "name": "mysql-bk-mysql", "reconcileID": "beac1f2e-9b30-4d4d-9ce9-8256dd31b099", "component": {"name":"mysql-bk-mysql","namespace":"default"}}
2025-11-04T07:35:41.787Z INFO wait for the workloads to be deleted: map[{workloads.kubeblocks.io/v1, Kind=InstanceSet default/mysql-bk-mysql}:0xc003c95108] {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"mysql-bk-mysql","namespace":"default"}, "namespace": "default", "name": "mysql-bk-mysql", "reconcileID": "d5227335-f03a-4a2f-9a39-1b638f9fff39", "component": {"name":"mysql-bk-mysql","namespace":"default"}}
2025-11-04T07:35:41.860Z INFO wait for the workloads to be deleted: map[{workloads.kubeblocks.io/v1, Kind=InstanceSet default/mysql-bk-mysql}:0xc003fe8708] {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"mysql-bk-mysql","namespace":"default"}, "namespace": "default", "name": "mysql-bk-mysql", "reconcileID": "5cb551b2-6f6f-4fdb-98c1-796666f41ca4", "component": {"name":"mysql-bk-mysql","namespace":"default"}}
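Since the component controller keeps waiting for the InstanceSet, it may help to look at the metadata of the stuck object directly. A minimal sketch, assuming the standard kubectl jsonpath output; the helper name `show_deletion_state` and the `KUBECTL` override variable are made up for illustration.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# show_deletion_state KIND NAME [NAMESPACE]
# Hypothetical helper: prints, one per line, the deletionTimestamp,
# the finalizers and the owner names of a single object.
# An empty first line means the object has not been marked for deletion.
show_deletion_state() {
  kind="$1"
  name="$2"
  namespace="${3:-default}"
  "$KUBECTL" get "$kind" "$name" -n "$namespace" \
    -o jsonpath='{.metadata.deletionTimestamp}{"\n"}{.metadata.finalizers}{"\n"}{.metadata.ownerReferences[*].name}{"\n"}'
}
```

For the case above this would be called as `show_deletion_state instanceset.workloads.kubeblocks.io mysql-bk-mysql default`.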
Additional context
starrocks-ce cluster delete hang
- create cluster
apiVersion: apps.kubeblocks.io/v1
kind: Cluster
metadata:
  name: strce-shiopn
  namespace: default
spec:
  clusterDef: starrocks-ce
  topology: shared-nothing
  terminationPolicy: WipeOut
  componentSpecs:
    - name: fe
      serviceVersion: 3.2.2
      disableExporter: true
      replicas: 1
      resources:
        requests:
          cpu: 1000m
          memory: 2Gi
        limits:
          cpu: 1000m
          memory: 2Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
    - name: be
      serviceVersion: 3.2.2
      replicas: 2
      resources:
        requests:
          cpu: 1000m
          memory: 2Gi
        limits:
          cpu: 1000m
          memory: 2Gi
      volumeClaimTemplates:
        - name: data
          spec:
            storageClassName:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 20Gi
- delete cluster
kbcli cluster delete strce-shiopn --auto-approve --namespace default
- see error
kubectl get cluster,pod,its,cmp,cm,secret,svc -l app.kubernetes.io/instance=strce-shiopn
NAME CLUSTER-DEFINITION TERMINATION-POLICY STATUS AGE
cluster.apps.kubeblocks.io/strce-shiopn starrocks-ce WipeOut Deleting 36m
NAME DESIRED UP-TO-DATE READY AVAILABLE AGE
instanceset.workloads.kubeblocks.io/strce-shiopn-fe 1 1 1 36m
NAME DEFINITION SERVICE-VERSION STATUS AGE
component.apps.kubeblocks.io/strce-shiopn-fe starrocks-ce-fe-1.1.0-alpha.0 3.2.2 Deleting 36m
NAME DATA AGE
configmap/strce-shiopn-fe-env 1 36m
configmap/strce-shiopn-fe-fe-cm 1 36m
configmap/strce-shiopn-fe-scripts 1 36m
NAME TYPE DATA AGE
secret/strce-shiopn-fe-account-root Opaque 2 36m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/strce-shiopn-fe-fe ClusterIP 10.225.212.252 <none> 8030/TCP,9030/TCP 36m
describe cluster
kubectl describe cluster strce-shiopn
Name: strce-shiopn
Namespace: default
Labels: app.kubernetes.io/instance=strce-shiopn
clusterdefinition.kubeblocks.io/name=starrocks-ce
Annotations: kubeblocks.io/crd-api-version: apps.kubeblocks.io/v1
API Version: apps.kubeblocks.io/v1
Kind: Cluster
Metadata:
Creation Timestamp: 2025-11-04T07:17:01Z
Deletion Grace Period Seconds: 0
Deletion Timestamp: 2025-11-04T07:43:42Z
Finalizers:
cluster.kubeblocks.io/finalizer
Generation: 13
Resource Version: 92532
UID: 66b89407-57f2-4c15-b6f5-409b454f44bc
Spec:
Cluster Def: starrocks-ce
Component Specs:
Annotations:
kubeblocks.io/restart: 2025-11-04T07:40:58Z
Component Def: starrocks-ce-fe-1.1.0-alpha.0
Disable Exporter: true
Flat Instance Ordinal: false
Name: fe
Pod Update Policy: PreferInPlace
Replicas: 1
Resources:
Limits:
Cpu: 1100m
Memory: 2254857830400m
Requests:
Cpu: 1100m
Memory: 2254857830400m
Service Version: 3.2.2
Volume Claim Templates:
Name: data
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 20Gi
Annotations:
kubeblocks.io/restart: 2025-11-04T07:40:58Z
Component Def: starrocks-ce-be-1.1.0-alpha.0
Flat Instance Ordinal: false
Name: be
Pod Update Policy: PreferInPlace
Replicas: 2
Resources:
Limits:
Cpu: 1100m
Memory: 2254857830400m
Requests:
Cpu: 1100m
Memory: 2254857830400m
Service Version: 3.2.2
Volume Claim Templates:
Name: data
Spec:
Access Modes:
ReadWriteOnce
Resources:
Requests:
Storage: 24Gi
Termination Policy: WipeOut
Topology: shared-nothing
Status:
Components:
Be:
Message:
InstanceSet/strce-shiopn-be: ["strce-shiopn-be-0"]
Observed Generation: 12
Phase: Running
Up To Date: true
Fe:
Observed Generation: 12
Phase: Running
Up To Date: true
Conditions:
Last Transition Time: 2025-11-04T07:17:01Z
Message: The operator has started the provisioning of Cluster: strce-shiopn
Observed Generation: 12
Reason: PreCheckSucceed
Status: True
Type: ProvisioningStarted
Last Transition Time: 2025-11-04T07:17:01Z
Message: Successfully applied for resources
Observed Generation: 12
Reason: ApplyResourcesSucceed
Status: True
Type: ApplyResources
Last Transition Time: 2025-11-04T07:34:00Z
Message: cluster strce-shiopn is ready
Reason: ClusterReady
Status: True
Type: Ready
Observed Generation: 12
Phase: Deleting
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal PreCheckSucceed 37m (x2 over 37m) cluster-controller The operator has started the provisioning of Cluster: strce-shiopn
Normal ApplyResourcesSucceed 37m (x2 over 37m) cluster-controller Successfully applied for resources
Normal ClusterComponentPhaseTransition 34m (x10 over 37m) cluster-controller cluster component fe is Creating
Normal ClusterComponentPhaseTransition 31m (x3 over 33m) cluster-controller cluster component be is Creating
Normal ClusterComponentPhaseTransition 26m (x12 over 28m) cluster-controller cluster component fe is Starting
Normal ClusterComponentPhaseTransition 11m (x106 over 33m) cluster-controller cluster component fe is Running
logs kubeblocks
2025-11-04T07:44:10.926Z INFO reconcile object *v1.InstanceSet with action DELETE OK {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"strce-shiopn-fe","namespace":"default"}, "namespace": "default", "name": "strce-shiopn-fe", "reconcileID": "b5dc4a42-312f-4278-aa25-30d8dca7f574", "component": {"name":"strce-shiopn-fe","namespace":"default"}}
2025-11-04T07:44:10.932Z INFO reconcile object *v1.Component with action STATUS OK {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"strce-shiopn-fe","namespace":"default"}, "namespace": "default", "name": "strce-shiopn-fe", "reconcileID": "b5dc4a42-312f-4278-aa25-30d8dca7f574", "component": {"name":"strce-shiopn-fe","namespace":"default"}}
2025-11-04T07:44:11.840Z INFO wait for the workloads to be deleted: map[{workloads.kubeblocks.io/v1, Kind=InstanceSet default/strce-shiopn-fe}:0xc00155f108] {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"strce-shiopn-fe","namespace":"default"}, "namespace": "default", "name": "strce-shiopn-fe", "reconcileID": "bc494946-4ed1-4e75-941c-220088eaa7a2", "component": {"name":"strce-shiopn-fe","namespace":"default"}}
2025-11-04T07:44:11.840Z INFO reconcile object *v1.InstanceSet with action DELETE OK {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"strce-shiopn-fe","namespace":"default"}, "namespace": "default", "name": "strce-shiopn-fe", "reconcileID": "bc494946-4ed1-4e75-941c-220088eaa7a2", "component": {"name":"strce-shiopn-fe","namespace":"default"}}
2025-11-04T07:44:11.846Z INFO reconcile object *v1.Component with action STATUS OK {"controller": "component", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Component", "Component": {"name":"strce-shiopn-fe","namespace":"default"}, "namespace": "default", "name": "strce-shiopn-fe", "reconcileID": "bc494946-4ed1-4e75-941c-220088eaa7a2", "component": {"name":"strce-shiopn-fe","namespace":"default"}}
2025-11-04T07:44:23.011Z INFO wait for the components and shardings to be deleted: map[fe:{}] {"controller": "cluster", "controllerGroup": "apps.kubeblocks.io", "controllerKind": "Cluster", "Cluster": {"name":"strce-shiopn","namespace":"default"}, "namespace": "default", "name": "strce-shiopn", "reconcileID": "cb9b5491-7091-43e7-a802-ffb96d337aab", "cluster": {"name":"strce-shiopn","namespace":"default"}}
While deleting the cluster, KubeBlocks reports "wait for the workloads to be deleted", but the InstanceSet (ITS) is never deleted.
After we annotated the ITS, it was deleted. It seems that the deletion of the workload resources failed to notify the ITS.
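As a stopgap, the annotation step could look roughly like the sketch below. The annotation key `example.com/reconcile-trigger` is a placeholder chosen for illustration (the key actually used is not recorded here), and the helper name `touch_its` and the `KUBECTL` override variable are likewise assumptions. The idea is only that any metadata update on the ITS produces a new event for its controller.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# touch_its NAME [NAMESPACE]
# Hypothetical helper: writes a throwaway annotation on an InstanceSet so
# that the object is updated. The annotation key is a placeholder, not a
# key that KubeBlocks defines.
touch_its() {
  name="$1"
  namespace="${2:-default}"
  "$KUBECTL" annotate instanceset.workloads.kubeblocks.io "$name" \
    -n "$namespace" \
    "example.com/reconcile-trigger=$(date +%s)" --overwrite
}
```

For example: `touch_its mysql-bk-mysql default`. This only works around the symptom; it does not explain why the ITS was not notified.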
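Comparing the PVCs of the original cluster and the restored cluster, we found the key difference: the owner reference. The PVC of the original cluster has an ownerReferences entry pointing to its InstanceSet, while the PVC of the restored cluster has none.
- PVC of original cluster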
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
    volume.kubernetes.io/selected-node: kbv110-control-plane
    volume.kubernetes.io/storage-provisioner: rancher.io/local-path
  creationTimestamp: "2025-11-05T07:13:27Z"
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/component: mysql-8.0-1.1.0-alpha.0
    app.kubernetes.io/instance: mysql-ccxdow
    app.kubernetes.io/managed-by: kubeblocks
    apps.kubeblocks.io/component-name: mysql
    apps.kubeblocks.io/pod-name: mysql-ccxdow-mysql-0
    apps.kubeblocks.io/release-phase: stable
    apps.kubeblocks.io/service-version: 8.0.30
    apps.kubeblocks.io/vct-name: data
    workloads.kubeblocks.io/instance: mysql-ccxdow-mysql
    workloads.kubeblocks.io/managed-by: InstanceSet
  name: data-mysql-ccxdow-mysql-0
  namespace: default
  ownerReferences:
    - apiVersion: workloads.kubeblocks.io/v1
      blockOwnerDeletion: true
      controller: true
      kind: InstanceSet
      name: mysql-ccxdow-mysql
      uid: d9843450-670c-4170-8eec-34270cd26aee
  resourceVersion: "1789150"
  uid: d207de82-6421-40a6-adc6-3a2467e3486c
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: standard
  volumeMode: Filesystem
  volumeName: pvc-d207de82-6421-40a6-adc6-3a2467e3486c
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  phase: Bound
- PVC of restored cluster
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: rancher.io/local-path
    volume.kubernetes.io/selected-node: kbv110-control-plane
    volume.kubernetes.io/storage-provisioner: rancher.io/local-path
  creationTimestamp: "2025-11-05T07:11:56Z"
  finalizers:
    - kubernetes.io/pvc-protection
  labels:
    app.kubernetes.io/component: mysql-8.0-1.1.0-alpha.0
    app.kubernetes.io/instance: mysql-bk2
    app.kubernetes.io/managed-by: kubeblocks
    apps.kubeblocks.io/component-name: mysql
    apps.kubeblocks.io/pod-name: mysql-bk2-mysql-0
    apps.kubeblocks.io/release-phase: stable
    apps.kubeblocks.io/service-version: 8.0.30
    apps.kubeblocks.io/vct-name: data
    componentdefinition.kubeblocks.io/name: mysql-8.0-1.1.0-alpha.0
    workloads.kubeblocks.io/instance: mysql-bk2-mysql
    workloads.kubeblocks.io/managed-by: InstanceSet
  name: data-mysql-bk2-mysql-0
  namespace: default
  resourceVersion: "1788354"
  uid: 339fbbea-36cd-4233-825f-8045e0e6f8c5
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: standard
  volumeMode: Filesystem
  volumeName: pvc-339fbbea-36cd-4233-825f-8045e0e6f8c5
status:
  accessModes:
    - ReadWriteOnce
  capacity:
    storage: 20Gi
  phase: Bound
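To spot PVCs like the restored one without diffing the YAML by hand, something along these lines might work. It is a sketch that assumes kubectl's custom-columns output prints `<none>` for a missing field; the function name `pvcs_without_owner` and the `KUBECTL` override variable are illustrative only.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example with a stub); it defaults to kubectl.
KUBECTL="${KUBECTL:-kubectl}"

# pvcs_without_owner INSTANCE [NAMESPACE]
# Hypothetical helper: prints the names of the PVCs labeled
# app.kubernetes.io/instance=INSTANCE that have no ownerReferences.
pvcs_without_owner() {
  instance="$1"
  namespace="${2:-default}"
  "$KUBECTL" get pvc -n "$namespace" \
    -l "app.kubernetes.io/instance=${instance}" \
    -o custom-columns='NAME:.metadata.name,OWNER:.metadata.ownerReferences[*].name' \
    --no-headers |
    awk '$2 == "<none>" { print $1 }'
}
```

For the clusters above, `pvcs_without_owner mysql-bk2` would be expected to list data-mysql-bk2-mysql-0, while `pvcs_without_owner mysql-ccxdow` should print nothing for the PVC shown.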
@leon-inf AFAIK, we solved an issue similar to this one previously. Do you have any idea about this?