cluster-api-provider-gcp

GKE cluster not able to scale-up

Open jayesh-srivastava opened this issue 8 months ago • 3 comments

/kind bug

What steps did you take and what happened: I created a GKE cluster with the latest CAPG version. After running clusterctl move on the target cluster, I scaled up the worker pool. The scale-up operation is stuck, and the CAPG pod shows the following logs.

I0309 16:06:53.828189       1 gcpmanagedmachinepool_controller.go:321] "Reconciling GCPManagedMachinePool" controller="gcpmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="GCPManagedMachinePool" GCPManagedMachinePool="default/js-clusterctl-gke-mp-0" namespace="default" name="js-clusterctl-gke-mp-0" reconcileID="f2bc8495-2b34-4a21-bf0d-7e9ae221b689" controller="gcpmanagedmachinepool"
I0309 16:06:53.829021       1 reconcile.go:61] "Reconciling node pool resources" controller="gcpmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="GCPManagedMachinePool" GCPManagedMachinePool="default/js-clusterctl-gke-mp-0" namespace="default" name="js-clusterctl-gke-mp-0" reconcileID="f2bc8495-2b34-4a21-bf0d-7e9ae221b689"
I0309 16:06:54.317482       1 reconcile.go:141] "Node pool running" controller="gcpmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="GCPManagedMachinePool" GCPManagedMachinePool="default/js-clusterctl-gke-mp-0" namespace="default" name="js-clusterctl-gke-mp-0" reconcileID="f2bc8495-2b34-4a21-bf0d-7e9ae221b689"
I0309 16:06:54.317848       1 reconcile.go:149] "Node pool config update required" controller="gcpmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="GCPManagedMachinePool" GCPManagedMachinePool="default/js-clusterctl-gke-mp-0" namespace="default" name="js-clusterctl-gke-mp-0" reconcileID="f2bc8495-2b34-4a21-bf0d-7e9ae221b689" request="name:\"projects/palette-pcp-08fkpbtkx7xcwa7xn2/locations/us-central1/clusters/default-js-clusterctl-gke-control-plane/nodePools/js-clusterctl-gke-mp-0\"  resource_labels:{labels:{key:\"capg-cluster-default-js-clusterctl-gke-control-plane\"  value:\"owned\"}}"
I0309 16:06:54.883397       1 reconcile.go:154] "Node pool config updating in progress" controller="gcpmanagedmachinepool" controllerGroup="infrastructure.cluster.x-k8s.io" controllerKind="GCPManagedMachinePool" GCPManagedMachinePool="default/js-clusterctl-gke-mp-0" namespace="default" name="js-clusterctl-gke-mp-0" reconcileID="f2bc8495-2b34-4a21-bf0d-7e9ae221b689"

The following logs are seen in the capi-controller-manager pod.

E0309 18:20:36.610034       1 cluster_controller_status.go:915] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment, MachineSet's ScalingUp conditions" err="sourceObjs can't be empty" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/js-clusterctl-gke" namespace="default" name="js-clusterctl-gke" reconcileID="4fa19d6a-b8f8-4e57-ab42-80d00d5f5b0e"
E0309 18:20:36.610272       1 cluster_controller_status.go:992] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment, MachineSet's ScalingDown conditions" err="sourceObjs can't be empty" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/js-clusterctl-gke" namespace="default" name="js-clusterctl-gke" reconcileID="4fa19d6a-b8f8-4e57-ab42-80d00d5f5b0e"
E0309 18:30:28.961323       1 cluster_controller_status.go:838] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment's RollingOut conditions" err="sourceObjs can't be empty" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/js-clusterctl-gke" namespace="default" name="js-clusterctl-gke" reconcileID="e2c45641-62bf-458e-bb9c-972005cd23e0"
E0309 18:30:28.961510       1 cluster_controller_status.go:915] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment, MachineSet's ScalingUp conditions" err="sourceObjs can't be empty" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/js-clusterctl-gke" namespace="default" name="js-clusterctl-gke" reconcileID="e2c45641-62bf-458e-bb9c-972005cd23e0"
E0309 18:30:28.962725       1 cluster_controller_status.go:992] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment, MachineSet's ScalingDown conditions" err="sourceObjs can't be empty" controller="cluster" controllerGroup="cluster.x-k8s.io" controllerKind="Cluster" Cluster="default/js-clusterctl-gke" namespace="default" name="js-clusterctl-gke" reconcileID="e2c45641-62bf-458e-bb9c-972005cd23e0"
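
The log lines above are long, so the sequence of events is easier to follow once each line is reduced to its severity, time, source location, and message. Below is a minimal sketch for doing that. It is not part of CAPG or CAPI: the `summarize` helper and the regular expression are illustrative assumptions based on the usual klog header layout (severity letter, MMDD, time, thread id, file:line), and the sample lines are shortened copies of lines from this issue.

```python
import re

# Assumed klog layout: <sev><MMDD> <time> <thread id> <file:line>] "<message>" key=value...
KLOG_RE = re.compile(
    r'^(?P<sev>[IWEF])(?P<date>\d{4}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+'
    r'(?P<tid>\d+) (?P<src>[^\]\s]+:\d+)\] "(?P<msg>(?:[^"\\]|\\.)*)"'
)


def summarize(line):
    """Return (severity, time, source, message) for a klog line, or None if it does not match."""
    m = KLOG_RE.match(line)
    if m is None:
        return None
    return m.group("sev"), m.group("time"), m.group("src"), m.group("msg")


# Shortened copies of log lines from this issue (trailing key/value pairs trimmed).
sample = [
    'I0309 16:06:53.828189       1 gcpmanagedmachinepool_controller.go:321] "Reconciling GCPManagedMachinePool" controller="gcpmanagedmachinepool"',
    'I0309 16:06:54.317848       1 reconcile.go:149] "Node pool config update required" controller="gcpmanagedmachinepool"',
    'I0309 16:06:54.883397       1 reconcile.go:154] "Node pool config updating in progress" controller="gcpmanagedmachinepool"',
    'E0309 18:20:36.610034       1 cluster_controller_status.go:915] "Failed to aggregate ControlPlane, MachinePool, MachineDeployment, MachineSet\'s ScalingUp conditions" err="sourceObjs can\'t be empty"',
]

rows = [summarize(line) for line in sample]

if __name__ == "__main__":
    for sev, time, src, msg in rows:
        print(f"{sev} {time} {src:45} {msg}")
```

Run against the full pod logs, the same helper would show whether the reconcile ever gets past "Node pool config updating in progress".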

What did you expect to happen: I expected the scale-up operation to be successful.

Anything else you would like to add: [Miscellaneous information that will assist in solving the issue.]

Environment:

  • Cluster-api version: v1.9.4
  • Minikube/KIND version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

jayesh-srivastava, Mar 09 '25 18:03