kops update cluster --target=terraform panics for a GCE cluster with a bastion InstanceGroup
/kind bug
1. What kops version are you running? The command kops version will display this information.
Client version: 1.26.3 (git-v1.26.3)
As far as I know, every kops version is affected.
2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running, or provide the Kubernetes version specified as a kops flag.
1.24.14
3. What cloud provider are you using?
GCE
4. What commands did you run? What is the simplest way to reproduce this issue?
On an existing kops GCE cluster, create a bastion InstanceGroup as described in the kops docs:
kops create ig --name k8s.cluster bastions --role Bastion --subnet utility-europe-west4
kops update cluster k8s.cluster --target=terraform --out=terraform/
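For reference, the InstanceGroup manifest produced by the kops create ig command above looked roughly like the sketch below. The machine type and min/max sizes are illustrative placeholders rather than the exact generated values; the relevant detail is that no spec.zones field is generated for the bastion group.

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.cluster
  name: bastions
spec:
  machineType: e2-micro   # illustrative placeholder
  maxSize: 1              # illustrative placeholder
  minSize: 1              # illustrative placeholder
  role: Bastion
  subnets:
  - utility-europe-west4
  # note: no zones field is present in the generated manifest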
5. What happened after the commands executed?
First, with the InstanceGroup manifest generated by the kops create ig command above, kops panics with an index out of range error:
W0609 11:31:40.268371 24080 external_access.go:39] TODO: Harmonize gcemodel ExternalAccessModelBuilder with awsmodel
W0609 11:31:40.268562 24080 firewall.go:41] TODO: Harmonize gcemodel with awsmodel for firewall - GCE model is way too open
W0609 11:31:40.268749 24080 storageacl.go:165] adding bucket level write IAM for role "redacted" to gs://ControlPlane to support etcd backup
panic: runtime error: index out of range [0] with length 0
goroutine 1 [running]:
k8s.io/kops/pkg/model/gcemodel.(*AutoscalingGroupModelBuilder).splitToZones(0x58661c0?, 0xc000b8ac00)
k8s.io/kops/pkg/model/gcemodel/autoscalinggroup.go:231 +0x1aa
k8s.io/kops/pkg/model/gcemodel.(*AutoscalingGroupModelBuilder).Build(0xc00069c220, 0xc0015e16c0?)
k8s.io/kops/pkg/model/gcemodel/autoscalinggroup.go:269 +0x125
k8s.io/kops/upup/pkg/fi/cloudup.(*Loader).BuildTasks(0xc0005f9738, {0x58642b0, 0xc00012e000}, 0xc0005de4b0)
k8s.io/kops/upup/pkg/fi/cloudup/loader.go:47 +0x124
k8s.io/kops/upup/pkg/fi/cloudup.(*ApplyClusterCmd).Run(0xc0005f9bd8, {0x58642b0, 0xc00012e000})
k8s.io/kops/upup/pkg/fi/cloudup/apply_cluster.go:700 +0x54c5
main.RunUpdateCluster({0x58642b0, 0xc00012e000}, 0xc00055e2c0, {0x5838820, 0xc000130008}, 0xc000786790)
k8s.io/kops/cmd/kops/update_cluster.go:293 +0xbb3
main.NewCmdUpdateCluster.func1(0xc0005c1200?, {0xc000b6be30?, 0x3?, 0x3?})
k8s.io/kops/cmd/kops/update_cluster.go:110 +0x3a
github.com/spf13/cobra.(*Command).execute(0xc0005c1200, {0xc000b6bdd0, 0x3, 0x3})
github.com/spf13/[email protected]/command.go:916 +0x862
github.com/spf13/cobra.(*Command).ExecuteC(0x7d1b2c0)
github.com/spf13/[email protected]/command.go:1044 +0x3bd
github.com/spf13/cobra.(*Command).Execute(...)
github.com/spf13/[email protected]/command.go:968
github.com/spf13/cobra.(*Command).ExecuteContext(...)
github.com/spf13/[email protected]/command.go:961
main.Execute({0x58642b0?, 0xc00012e000})
k8s.io/kops/cmd/kops/root.go:95 +0xab
main.main()
k8s.io/kops/cmd/kops/main.go:23 +0x27
This can be worked around by manually specifying a zone in the spec.zones array, for example europe-west4-b (a sketch of the resulting InstanceGroup spec is included after the trace below). Running the same kops update cluster command after that causes a segfault with the following output:
W0609 11:33:43.251912 24895 external_access.go:39] TODO: Harmonize gcemodel ExternalAccessModelBuilder with awsmodel
W0609 11:33:43.252209 24895 firewall.go:41] TODO: Harmonize gcemodel with awsmodel for firewall - GCE model is way too open
W0609 11:33:43.252443 24895 storageacl.go:165] adding bucket level write IAM for role "redacted" to gs://ControlPlane to support etcd backup
W0609 11:33:43.252760 24895 autoscalinggroup.go:130] enabling storage-rw for etcd backups
W0609 11:33:43.253004 24895 autoscalinggroup.go:130] enabling storage-rw for etcd backups
W0609 11:33:43.253197 24895 autoscalinggroup.go:130] enabling storage-rw for etcd backups
I0609 11:33:43.258275 24895 executor.go:111] Tasks: 0 done / 95 total; 54 can run
I0609 11:33:43.298027 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.325407 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.360259 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.390925 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.422149 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.457276 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.494929 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.531397 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.562449 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.593414 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.623793 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.654772 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.687333 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.716446 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.749615 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.779673 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.807920 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.839785 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.901680 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.932934 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:43.966634 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:44.000258 24895 storage.go:65] bucket gs://redacted has bucket-policy only; won't try to set ACLs
I0609 11:33:44.000413 24895 executor.go:111] Tasks: 54 done / 95 total; 19 can run
I0609 11:33:44.354351 24895 executor.go:111] Tasks: 73 done / 95 total; 15 can run
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x20c1c89]
goroutine 627 [running]:
k8s.io/kops/upup/pkg/fi.CopyResource({0x582fa00, 0xc000b78450}, {0x0?, 0x0?})
k8s.io/kops/upup/pkg/fi/resources.go:85 +0x69
k8s.io/kops/upup/pkg/fi.ResourceAsString({0x0, 0x0})
k8s.io/kops/upup/pkg/fi/resources.go:103 +0x4c
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks.(*InstanceTemplate).mapToGCE(0xc00107a1e0, {0xc000980cd8, 0x13}, {0xc000a0fe00, 0xc})
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks/instancetemplate.go:347 +0xb88
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks.(*InstanceTemplate).RenderTerraform(0x5?, 0xc000638f00, 0xc0008b4600?, 0xc00107a1e0, 0x2?)
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks/instancetemplate.go:607 +0x73
reflect.Value.call({0x4b1c2a0?, 0xc00107a1e0?, 0x6?}, {0x4e08ac0, 0x4}, {0xc00109ac00, 0x4, 0x5885130?})
reflect/value.go:584 +0x8c5
reflect.Value.Call({0x4b1c2a0?, 0xc00107a1e0?, 0x4e450bb?}, {0xc00109ac00?, 0xc001348260?, 0xc0015bc060?})
reflect/value.go:368 +0xbc
k8s.io/kops/upup/pkg/fi.(*Context[...]).Render(0xc00132e5a0, {0x5837ca0, 0x0?}, {0x5837ca0, 0xc00107a1e0?}, {0x5837ca0, 0xc001318000?})
k8s.io/kops/upup/pkg/fi/context.go:237 +0x10f5
k8s.io/kops/upup/pkg/fi.defaultDeltaRunMethod[...]({0x5837ca0, 0xc00107a1e0?}, 0xc00132e5a0)
k8s.io/kops/upup/pkg/fi/default_methods.go:100 +0x4e5
k8s.io/kops/upup/pkg/fi.CloudupDefaultDeltaRunMethod(...)
k8s.io/kops/upup/pkg/fi/default_methods.go:41
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks.(*InstanceTemplate).Run(0xc001328300?, 0x5837ca0?)
k8s.io/kops/upup/pkg/fi/cloudup/gcetasks/instancetemplate.go:233 +0x2d
k8s.io/kops/upup/pkg/fi.(*executor[...]).forkJoin.func1(0x4)
k8s.io/kops/upup/pkg/fi/executor.go:195 +0x290
created by k8s.io/kops/upup/pkg/fi.(*executor[...]).forkJoin
k8s.io/kops/upup/pkg/fi/executor.go:183 +0xbe
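For completeness, after applying the workaround the bastion InstanceGroup spec looked roughly like the sketch below. Again, the machine type and sizes are illustrative placeholders; the only meaningful change is the manually added spec.zones entry.

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  labels:
    kops.k8s.io/cluster: k8s.cluster
  name: bastions
spec:
  machineType: e2-micro   # illustrative placeholder
  maxSize: 1              # illustrative placeholder
  minSize: 1              # illustrative placeholder
  role: Bastion
  subnets:
  - utility-europe-west4
  zones:                  # added manually to get past the first panic
  - europe-west4-b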
6. What did you expect to happen?
The Terraform code for deploying a bastion to be generated, instead of kops panicking.
7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.
apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: null
  generation: 1
  name: <redacted>
spec:
  api:
    loadBalancer:
      type: Public
  authorization:
    rbac: {}
  channel: stable
  cloudConfig:
    gcpPDCSIDriver:
      enabled: false
    manageStorageClasses: false
  cloudProvider: gce
  configBase: <redacted>
  containerRuntime: containerd
  containerd:
    configOverride: |
      version = 2
      [plugins]
        [plugins."io.containerd.grpc.v1.cri"]
          [plugins."io.containerd.grpc.v1.cri".containerd]
            [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
              [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
                runtime_type = "io.containerd.runc.v2"
                [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
                  SystemdCgroup = true
  dnsZone: <redacted>
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: control-plane-europe-west4-a
      name: a
    - instanceGroup: control-plane-europe-west4-b
      name: b
    - instanceGroup: control-plane-europe-west4-c
      name: c
    manager:
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:8081
      - name: ETCD_METRICS
        value: extended
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 30d
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: control-plane-europe-west4-a
      name: a
    - instanceGroup: control-plane-europe-west4-b
      name: b
    - instanceGroup: control-plane-europe-west4-c
      name: c
    manager:
      env:
      - name: ETCD_LISTEN_METRICS_URLS
        value: http://0.0.0.0:8082
      - name: ETCD_METRICS
        value: extended
      - name: ETCD_MANAGER_HOURLY_BACKUPS_RETENTION
        value: 1d
      - name: ETCD_MANAGER_DAILY_BACKUPS_RETENTION
        value: 7d
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubeAPIServer:
    auditLogMaxAge: 5
    auditLogMaxBackups: 1
    auditLogMaxSize: 100
    auditLogPath: /var/log/kube-apiserver-audit.log
    auditPolicyFile: /srv/kubernetes/kube-apiserver/audit.conf
    defaultNotReadyTolerationSeconds: 150
    defaultUnreachableTolerationSeconds: 150
    disableBasicAuth: true
    enableProfiling: false
    eventTTL: 6h0m0s
    featureGates:
      EphemeralContainers: "true"
    logFormat: json
  kubeControllerManager:
    configureCloudRoutes: true
    featureGates:
      EphemeralContainers: "true"
      InTreePluginGCEUnregister: "true"
    horizontalPodAutoscalerDownscaleDelay: 3m0s
    horizontalPodAutoscalerSyncPeriod: 15s
    horizontalPodAutoscalerUpscaleDelay: 3m0s
    logFormat: json
  kubeDNS:
    nodeLocalDNS:
      enabled: true
    provider: CoreDNS
  kubeProxy:
    metricsBindAddress: 0.0.0.0
  kubeScheduler:
    logFormat: json
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    cgroupDriver: systemd
    featureGates:
      EphemeralContainers: "true"
      InTreePluginGCEUnregister: "true"
    logFormat: json
  kubernetesApiAccess:
  - <redacted>
  kubernetesVersion: 1.24.14
  masterPublicName: <redacted>
  networking:
    canal: {}
  nonMasqueradeCIDR: 100.64.0.0/10
  ntp:
    managed: false
  project: <redacted>
  sshAccess:
  - <redacted>
  subnets:
  - cidr: 10.0.16.0/20
    egress: External
    name: cluster-europe-west4
    region: europe-west4
    type: Private
  - cidr: 10.0.32.0/20
    egress: External
    name: utility-europe-west4
    region: europe-west4
    type: Utility
  topology:
    dns:
      type: Public
    masters: private
    nodes: private
Hi @tesspib. Could you check this again when kOps 1.27.0 is released? There are many GCE-related improvements there.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
In response to this:
/close not-planned
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.