cluster-api-provider-aws
ROSA: RosaMachinePool and MachinePool CRs stuck at de-provision ROSA-HCP
/kind bug
What steps did you take and what happened:
We use a GitOps workflow to provision ROSA-HCP. The required CRs (RosaControlPlane, RosaCluster, RosaMachinePool, Cluster, and MachinePool) are stored in a git repo, and ArgoCD syncs them to the installer cluster.
To de-provision the ROSA-HCP cluster:
1- Delete all the required CRs from the git repo.
2- Let ArgoCD sync the git repo state and delete all the required CRs from the installer cluster.
3- The ROSA-HCP cluster starts the uninstall process.
4- The RosaControlPlane and RosaCluster CRs are deleted; however, the RosaMachinePool, MachinePool, and Cluster CRs get stuck and are never deleted.
5- According to the AWS console and the Red Hat console, all the ROSA-HCP and AWS resources are destroyed.
6- After manually removing the finalizers from the RosaMachinePool and MachinePool CRs, the Cluster CR is deleted.
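The manual workaround in step 6 can be sketched as follows. This is a minimal illustration, not something taken from the attached logs: it only builds the `kubectl patch` commands (it does not run them), and the fully qualified resource names are assumptions that should be checked with `kubectl api-resources`. The order (RosaMachinePool first, then MachinePool) follows the order reported later in this thread.

```python
import json
import shlex

# Assumed resource names; verify them with `kubectl api-resources`.
RESOURCES_IN_ORDER = [
    "rosamachinepools.infrastructure.cluster.x-k8s.io",
    "machinepools.cluster.x-k8s.io",
]


def finalizer_clear_patch() -> str:
    """Return a JSON merge patch that sets metadata.finalizers to null."""
    return json.dumps({"metadata": {"finalizers": None}}, separators=(",", ":"))


def kubectl_patch_commands(name: str, namespace: str) -> list:
    """Build one `kubectl patch` argv per resource, in removal order."""
    patch = finalizer_clear_patch()
    return [
        ["kubectl", "patch", resource, name, "-n", namespace,
         "--type", "merge", "-p", patch]
        for resource in RESOURCES_IN_ORDER
    ]


if __name__ == "__main__":
    # Names taken from the log excerpts in this issue.
    for cmd in kubectl_patch_commands("workers-ex", "ns-rosa-hcp"):
        print(shlex.join(cmd))
```

Clearing finalizers by hand skips whatever cleanup the controller would have done, so it is only reasonable here because the AWS and Red Hat consoles already show the resources as destroyed.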
What did you expect to happen: We expected all the CRs to be deleted when the cluster is uninstalled.
Anything else you would like to add: The capa-controller-manager deployment logs show the following error during the uninstall process:
E0415 22:22:23.974724 1 controller.go:329] "Reconciler error" err="Node pools can only be deleted on clusters in 'ready' state, cluster requested is in 'uninstalling' state." controller="rosamachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ROSAMachinePool" ROSAMachinePool="ns-rosa-hcp/workers-ex" namespace="ns-rosa-hcp" name="workers-ex" reconcileID="4a8dfbec-4fb6-48ff-aa07-bf75ba7cd31b"
After the AWS and ROSA-HCP resources are destroyed, the following log is shown:
I0415 22:42:21.241631 1 rosamachinepool_controller.go:142] "Failed to retrieve ControlPlane from MachinePool" controller="rosamachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="ROSAMachinePool" ROSAMachinePool="ns-rosa-hcp/workers-ex" namespace="ns-rosa-hcp" name="workers-ex" reconcileID="78e9b21b-937c-41fe-9550-34b974be85dc" MachinePool="ns-rosa-hcp/workers-ex" cluster="ns-rosa-hcp/rosa-hcp-2"
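One possible reading of the two CAPA messages above is that the node pool delete call is rejected while the cluster is in the 'uninstalling' state, and that once the control plane is gone the controller returns early without removing its finalizer. The sketch below models a decision table that would avoid getting stuck under that reading. It is purely hypothetical: `DeleteDecision` and `decide_node_pool_delete` are illustrative names, not CAPA code, and the only cluster states used are the two quoted in the error message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeleteDecision:
    call_delete_api: bool   # ask the ROSA API to delete the node pool
    remove_finalizer: bool  # allow Kubernetes to remove the CR
    requeue: bool           # check again on a later reconcile


def decide_node_pool_delete(cluster_state: Optional[str]) -> DeleteDecision:
    """Hypothetical decision table; None means the cluster no longer exists."""
    if cluster_state is None:
        # Nothing is left to delete, so the finalizer can be dropped.
        return DeleteDecision(call_delete_api=False, remove_finalizer=True, requeue=False)
    if cluster_state == "ready":
        # The API accepts node pool deletion only in this state.
        return DeleteDecision(call_delete_api=True, remove_finalizer=False, requeue=True)
    # Any other state (e.g. "uninstalling"): do not call the API, wait instead.
    return DeleteDecision(call_delete_api=False, remove_finalizer=False, requeue=True)
```

Under this model the 'uninstalling' state is treated as "wait" rather than as a reconcile error, and a missing cluster is treated as a successful deletion.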
The capi-controller-manager deployment logs show the following during the uninstall process:
I0415 22:49:51.074671 1 cluster_controller.go:269] "Cluster still has descendants - need to requeue" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="ns-rosa-hcp/rosa-hcp-2" namespace="ns-rosa-hcp" name="rosa-hcp-2" reconcileID="6d892381-4d67-402d-97da-c4a2f76c7a57" descendants="Machine pools: workers-ex" indirect descendants count=0
E0415 22:49:54.492574 1 controller.go:329] "Reconciler error" err="failed to create cluster accessor: error fetching REST client config for remote cluster \"ns-rosa-hcp/rosa-hcp-2\": failed to retrieve kubeconfig secret for Cluster ns-rosa-hcp/rosa-hcp-2: Secret \"rosa-hcp-2-kubeconfig\" not found" controller="machinepool" controllerGroup="cluster.x-k8s.io" controllerKind="MachinePool" MachinePool="ns-rosa-hcp/workers-ex" namespace="ns-rosa-hcp" name="workers-ex" reconcileID="c2c91b09-2e38-4dfd-9a2f-c0478de02ac1"
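When triaging logs like the ones above, it can help to split each line into its header and its `key="value"` fields. The following is a rough sketch that assumes the layout seen in these excerpts (severity letter, date, time, thread ID, `file:line]`, a quoted message, then quoted pairs); it is not a complete parser for the logging format, and `parse_klog_line` is a hypothetical helper name. Unquoted fields such as `count=0` are ignored.

```python
import re

# Header as it appears in the excerpts: E0415 22:22:23.974724 1 controller.go:329] "msg"
_HEADER = re.compile(
    r'^([IWEF])(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+)\s+\d+ ([\w.]+:\d+)\] "((?:[^"\\]|\\.)*)"'
)
# key="value" pairs; values may contain backslash-escaped quotes.
_PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def _unescape(value: str) -> str:
    """Undo only the \\" and \\\\ escapes; other escapes are left as-is."""
    return re.sub(r'\\(["\\])', r"\1", value)


def parse_klog_line(line: str):
    """Return a dict describing the log line, or None if it does not match."""
    match = _HEADER.match(line)
    if match is None:
        return None
    severity, date, time, source, message = match.groups()
    rest = line[match.end():]
    fields = {key: _unescape(val) for key, val in _PAIR.findall(rest)}
    return {
        "severity": severity,
        "date": date,
        "time": time,
        "source": source,
        "message": _unescape(message),
        "fields": fields,
    }


if __name__ == "__main__":
    sample = (
        'E0415 22:49:54.492574 1 controller.go:329] "Reconciler error" '
        'err="Secret \\"rosa-hcp-2-kubeconfig\\" not found" controller="machinepool"'
    )
    record = parse_klog_line(sample)
    print(record["fields"]["controller"], "->", record["fields"]["err"])
```

Grouping the parsed records by the `controller` field makes it easier to see which controller is holding up the deletion.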
capi-logs.txt capa-logs.txt rosa-hcp-2.txt
Environment:
- Cluster-api-provider-aws version: v2.4.2
- Kubernetes version: (use kubectl version): v1.27.3
- OS (e.g. from /etc/os-release):
This issue is currently awaiting triage.
If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
6- After manually removing the finalizers from the RosaMachinePool and MachinePool CRs, the Cluster CR is deleted.
Can you share which finalizer specifically?
RosaMachinePool finalizer then MachinePool finalizer