cluster-api-provider-gcp When Zone not set, reconcile fails after 'clusterctl move'


/kind bug

What steps did you take and what happened:

Create a cluster using a bootstrap cluster, without setting a failureDomain on the MachineDeployment.
Run clusterctl move to move the CAPI resources to the workload cluster.

After the move, the controller panics while reconciling the GCPMachine:

I0707 05:50:46.383763       1 reconcile.go:38] controller/gcpmachine "msg"="Reconciling instance resources" "name"="gcp-test-3521295-md-0-brslr" "namespace"="default" "reconciler group"="infrastructure.cluster.x-k8s.io" "reconciler kind"="GCPMachine" 
panic: runtime error: index out of range [0] with length 0

goroutine 453 [running]:
sigs.k8s.io/cluster-api-provider-gcp/cloud/scope.(*MachineScope).Zone(0x29a4242)
	sigs.k8s.io/cluster-api-provider-gcp/cloud/scope/machine.go:103 +0x1d9
sigs.k8s.io/cluster-api-provider-gcp/cloud/scope.(*MachineScope).InstanceSpec(0xc000782840)
	sigs.k8s.io/cluster-api-provider-gcp/cloud/scope/machine.go:326 +0x4f
sigs.k8s.io/cluster-api-provider-gcp/cloud/services/compute/instances.(*Service).createOrGetInstance(0xc000b96240, {0x2d4eb20, 0xc0006e7200})
	sigs.k8s.io/cluster-api-provider-gcp/cloud/services/compute/instances/reconcile.go:134 +0xd2
sigs.k8s.io/cluster-api-provider-gcp/cloud/services/compute/instances.(*Service).Reconcile(0xc000b96240, {0x2d4eb20, 0xc0006e7200})
	sigs.k8s.io/cluster-api-provider-gcp/cloud/services/compute/instances/reconcile.go:39 +0x8a
sigs.k8s.io/cluster-api-provider-gcp/controllers.(*GCPMachineReconciler).reconcile(0x2db9ee8, {0x2d4eb20, 0xc0006e7200}, 0xc000782840)
	sigs.k8s.io/cluster-api-provider-gcp/controllers/gcpmachine_controller.go:228 +0x112
sigs.k8s.io/cluster-api-provider-gcp/controllers.(*GCPMachineReconciler).Reconcile(0xc0006a3530, {0x2d4eb58, 0xc0007d94d0}, {{{0xc000345930, 0x284c160}, {0xc0006b5200, 0x30}}})
	sigs.k8s.io/cluster-api-provider-gcp/controllers/gcpmachine_controller.go:216 +0xa10
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0xc0006a0840, {0x2d4eb58, 0xc0007d94a0}, {{{0xc000345930, 0x284c160}, {0xc0006b5200, 0x413c34}}})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:114 +0x26f
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0006a0840, {0x2d4eab0, 0xc000421dc0}, {0x256fe80, 0xc00045c420})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:311 +0x33e
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0006a0840, {0x2d4eab0, 0xc000421dc0})
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266 +0x205
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227 +0x85
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2
	sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:223 +0x357

What did you expect to happen: After https://github.com/kubernetes-sigs/cluster-api-provider-gcp/pull/584, it should be possible to run clusterctl move on a cluster even when no failureDomain is set.

Anything else you would like to add: This conversation seems relevant: https://github.com/kubernetes-sigs/cluster-api-provider-gcp/pull/584#discussion_r852403537.

~Maybe we need to fall back to the API if len(zones) == 0?~ After a move, the Cluster's Status is not yet known, but in either case we want to get the Machine's already assigned Zone, not one from the Cluster's status.
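To make the idea concrete, here is a minimal sketch of the lookup order described above. It is not the provider's actual code: `resolveZone` and its parameters are hypothetical names, and where the already assigned zone would come from (for example, the existing instance) is an assumption. The point is only that an empty zone list is handled explicitly instead of being indexed.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// resolveZone is a hypothetical helper, not part of cluster-api-provider-gcp.
//
//   - failureDomain: the failure domain set on the Machine, nil if unset.
//   - assignedZone:  the zone the machine already runs in, "" if unknown
//     (assumed to be discoverable from the existing instance).
//   - clusterZones:  zones from the Cluster's status; may be empty right
//     after 'clusterctl move', before the status is repopulated.
func resolveZone(failureDomain *string, assignedZone string, clusterZones []string) (string, error) {
	// An explicit failure domain always wins.
	if failureDomain != nil && *failureDomain != "" {
		return *failureDomain, nil
	}
	// Prefer the zone the machine was already assigned over the Cluster's status.
	if assignedZone != "" {
		return assignedZone, nil
	}
	// Guard the empty case instead of indexing zones[0] unconditionally.
	if len(clusterZones) == 0 {
		return "", errors.New("no zone available: failureDomain unset and cluster status has no zones yet")
	}
	// Copy before sorting so the caller's slice is not mutated.
	zones := append([]string(nil), clusterZones...)
	sort.Strings(zones)
	return zones[0], nil
}

func main() {
	// Moved cluster: no failureDomain, empty status, but the machine already has a zone.
	zone, err := resolveZone(nil, "us-central1-b", nil)
	fmt.Println(zone, err)

	// Nothing known yet: an error the reconciler could retry on, rather than a panic.
	_, err = resolveZone(nil, "", nil)
	fmt.Println(err)
}
```

Returning an error in the last case would let the reconcile be requeued until the Cluster's status is available; whether that is the right behavior for the controller is a design choice this sketch does not settle.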

Environment:

Cluster-api version:
Minikube/KIND version:
Kubernetes version: (use kubectl version):
OS (e.g. from /etc/os-release):

Jul 07 '22 13:07 dkoshkin

/assign

Jul 31 '22 11:07 aniruddha2000

@dkoshkin Could you help me reproduce the issue? Which commands did you execute, and after which one did you run clusterctl move?

Aug 02 '22 03:08 aniruddha2000

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

Oct 31 '22 04:10 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

Nov 30 '22 05:11 k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen
Mark this issue as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Dec 30 '22 05:12 k8s-triage-robot

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

After 90d of inactivity, lifecycle/stale is applied

After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied

After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue with /reopen

Mark this issue as fresh with /remove-lifecycle rotten

Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Dec 30 '22 05:12 k8s-ci-robot
