karmada icon indicating copy to clipboard operation
karmada copied to clipboard

Enhanced forced deletion on karmadactl unjoin

Open zhzhuang-zju opened this issue 2 years ago • 5 comments

What type of PR is this? /kind feature

What this PR does / why we need it: When karmadactl unjoin a member cluster, the unjoin may fail all the time because the resource deletion of the member cluster fails. By enhancing the forced deletion on karmadactl unjoin, if the resource deletion of the member cluster fails also force unjoin Which issue(s) this PR fixes: Fixes #4431

Special notes for your reviewer:

  1. delete clusterRole of member1
➜  karmada git:(master) km1 delete clusterrole karmada-controller-manager:karmada-member1 
clusterrole.rbac.authorization.k8s.io "karmada-controller-manager:karmada-member1" deleted
  1. karmadactl unjoin member1
➜  karmada git:(master) karmadactl unjoin member1 --cluster-kubeconfig=/root/.kube/members.config
I1219 15:56:28.602959   54698 unjoin.go:264] Waiting for the cluster object member1 to be deleted
...
I1219 15:57:27.605630   54698 unjoin.go:264] Waiting for the cluster object member1 to be deleted
E1219 15:57:27.605651   54698 unjoin.go:268] Failed to delete cluster object. cluster name: member1, error: timed out waiting for the condition
E1219 15:57:27.605675   54698 unjoin.go:191] Failed to delete cluster object. cluster name: member1, error: timed out waiting for the condition
error: timed out waiting for the condition
  1. karmadactl unjoin member1 --force
➜  karmada git:(master) karmadactl unjoin member1 --cluster-kubeconfig=/root/.kube/members.config --force
I1219 15:57:48.816159   54808 unjoin.go:264] Waiting for the cluster object member1 to be deleted
...
I1219 15:58:47.822262   54808 unjoin.go:264] Waiting for the cluster object member1 to be deleted
E1219 15:58:47.822290   54808 unjoin.go:268] Failed to delete cluster object. cluster name: member1, error: timed out waiting for the condition
E1219 15:58:47.822305   54808 unjoin.go:191] Failed to delete cluster object. cluster name: member1, error: timed out waiting for the condition
I1219 15:58:47.822310   54808 unjoin.go:193] Start forced deletion by remove work finalizer. cluster name: member1
I1219 15:58:47.847819   54808 unjoin.go:200] Succeeded to remove work's finalizer. After confirming the success of the unjoin, manually delete remaining resources on the cluster member1.
➜  karmada git:(master) ka get cluster
NAME      VERSION   MODE   READY   AGE
member2   v1.27.3   Push   True    4h3m
member3   v1.27.3   Pull   True    4h3m
  1. rejoin
➜  karmada git:(master) karmadactl join member1 --cluster-kubeconfig=/root/.kube/members.config             
cluster(member1) is joined successfully
➜  karmada git:(master) ka get cluster
NAME      VERSION   MODE   READY   AGE
member1   v1.27.3   Push   True    5s
member2   v1.27.3   Push   True    4h15m
member3   v1.27.3   Pull   True    4h15m

Does this PR introduce a user-facing change?:

`karmadactl unjoin`: force unjoin member clusters with the flag `--force`

zhzhuang-zju avatar Dec 19 '23 08:12 zhzhuang-zju

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 1.38889% with 71 lines in your changes missing coverage. Please review.

Project coverage is 42.24%. Comparing base (9c0bd72) to head (1eb2f9e). Report is 13 commits behind head on master.

Files with missing lines Patch % Lines
pkg/karmadactl/util/cluster.go 0.00% 56 Missing :warning:
pkg/karmadactl/unregister/unregister.go 7.69% 12 Missing :warning:
pkg/karmadactl/unjoin/unjoin.go 0.00% 3 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4451      +/-   ##
==========================================
- Coverage   42.25%   42.24%   -0.02%     
==========================================
  Files         655      655              
  Lines       55756    55847      +91     
==========================================
+ Hits        23561    23593      +32     
- Misses      30683    30740      +57     
- Partials     1512     1514       +2     
Flag Coverage Δ
unittests 42.24% <1.38%> (-0.02%) :arrow_down:

Flags with carried forward coverage won't be shown. Click here to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov-commenter avatar Dec 19 '23 08:12 codecov-commenter

This request I think is reasonable, but I don't think just deleting the cluster can solve it, it should also forcibly delete the work under this cluster. It should also forcibly delete other resources like token,secret,etc.

In fact, I achieve the purpose of forced deletion by deleting the finalizer that blocks the deletion step. If the cluster object can be deleted successfully, then resources such as secrets and tokens can also be deleted successfully.

zhzhuang-zju avatar Dec 26 '23 07:12 zhzhuang-zju

The issue can be resolved by setting PropagationPolicy.Spec.PreserveResourcesOnDeletion.

zhzhuang-zju avatar Oct 12 '24 08:10 zhzhuang-zju

The issue can be resolved by setting PropagationPolicy.Spec.PreserveResourcesOnDeletion.

Apologies for the confusion between the two scenarios earlier. PropagationPolicy.Spec.PreserveResourcesOnDeletion is used to retain resources in the member cluster when a workload is deleted. Meanwhile, karmadactl join --force is used to delete cluster and secret resources even if resources in the cluster targeted for unjoin are not removed successfully. I will track the progress of this feature through #5477.

zhzhuang-zju avatar Oct 14 '24 03:10 zhzhuang-zju

cc @chaunceyjiang @chaosi-zju

zhzhuang-zju avatar Oct 23 '24 01:10 zhzhuang-zju

Thank you for your quick fix, generally looks good

chaosi-zju avatar Oct 23 '24 06:10 chaosi-zju

you shall resolve the conflict and do a rebase.

chaosi-zju avatar Oct 31 '24 03:10 chaosi-zju

Enhanced forced deletion on karmadactl unjoin

you shall resolve the conflict and do a rebase.

done

zhzhuang-zju avatar Nov 01 '24 06:11 zhzhuang-zju

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RainbowMango

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment Approvers can cancel approval by writing /approve cancel in a comment

karmada-bot avatar Nov 04 '24 03:11 karmada-bot