`create cluster` fails when VPC CNI is configured to use both `iam.withOIDC` and `useDefaultPodIdentityAssociations`
The following config results in a panic:
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: test-cluster-3
  region: us-east-1
  version: '1.28'
addons:
  - name: eks-pod-identity-agent
    version: v1.3.0
  - name: vpc-cni
    version: v1.18.2
    useDefaultPodIdentityAssociations: true
iam:
  withOIDC: true
secretsEncryption:
  keyARN: arn:aws:kms:us-east-1:123456789:alias/test-kms
Log output and stack trace:
2024-08-02 14:56:26 [ℹ] creating addon
2024-08-02 14:56:27 [ℹ] successfully created addon
2024-08-02 14:56:28 [ℹ] "addonsConfig.autoApplyPodIdentityAssociations" is set to true; will lookup recommended pod identity configuration for "vpc-cni" addon
2024-08-02 14:56:30 [ℹ] deploying stack "eksctl-test-cluster-3-addon-vpc-cni-podidentityrole-aws-node"
2024-08-02 14:56:30 [ℹ] waiting for CloudFormation stack "eksctl-test-cluster-3-addon-vpc-cni-podidentityrole-aws-node"
2024-08-02 14:57:01 [ℹ] waiting for CloudFormation stack "eksctl-test-cluster-3-addon-vpc-cni-podidentityrole-aws-node"
2024-08-02 14:57:02 [ℹ] creating addon
2024-08-02 14:57:03 [ℹ] successfully created addon
2024-08-02 14:57:04 [ℹ] creating addon
2024-08-02 14:57:04 [ℹ] successfully created addon
2024-08-02 14:57:05 [ℹ] creating addon
2024-08-02 14:57:06 [ℹ] successfully created addon
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x20 pc=0x1055303f8]
goroutine 187 [running]:
github.com/weaveworks/eksctl/pkg/actions/addon.(*Manager).Update(0x1400073f4a0, {0x107bc4e68, 0x10a3332e0}, 0x140005c2b40, {0x0, 0x0}, 0x15d3ef79800)
github.com/weaveworks/eksctl/pkg/actions/addon/update.go:121 +0xeb8
github.com/weaveworks/eksctl/pkg/actions/addon.CreateAddonTasks.func3()
github.com/weaveworks/eksctl/pkg/actions/addon/tasks.go:110 +0x90
github.com/weaveworks/eksctl/pkg/utils/tasks.(*GenericTask).Do(0x14000a2bd58, 0x0?)
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:31 +0x34
github.com/weaveworks/eksctl/pkg/utils/tasks.doSingleTask(0x0?, {0x107b74ac0, 0x14000a2bd58})
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:202 +0xc8
github.com/weaveworks/eksctl/pkg/utils/tasks.doSequentialTasks(0x1400061b4e0?, {0x1400061e980, 0x5, 0x1400022c160?})
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:250 +0x6c
created by github.com/weaveworks/eksctl/pkg/utils/tasks.(*TaskTree).Do in goroutine 185
github.com/weaveworks/eksctl/pkg/utils/tasks/tasks.go:158 +0x258
Not sure if related, but I found that eksctl 0.187.0 wrongly logs the following warning during `create cluster` when the vpc-cni addon is specified without pod identity but with `attachPolicyARNs`:
IRSA config is set for "vpc-cni" addon, but since OIDC is disabled on the cluster, eksctl cannot configure the requested permissions; the recommended way to provide IAM permissions for "vpc-cni" addon is via pod identity associations; after addon creation is completed, add all recommended policies to the config file, under `addon.PodIdentityAssociations`, and run `eksctl update addon`
The cluster config does have `iam.withOIDC: true`, and OIDC works without issues once the cluster is created.
Can confirm @artem-nefedov's experience. I got the same error message despite OIDC being set to true. eksctl version is 0.190.0.
I have the same issue with 0.191.0-dev+c736924d6.2024-09-27T00:54:42Z. I've set up vpc-cni with the following settings:
addons:
  - name: vpc-cni
    podIdentityAssociations:
      - namespace: kube-system
        permissionPolicyARNs: ["arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"]
        serviceAccountName: aws-node
Seems like the same issue as #7951.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
I am the original person that brought this issue to AWS support. Just commenting to avoid ticket closure.
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
remove stale
Tested with 0.200.0, issue still there.
Update: it seems GHA won't reopen the issue, so I created a new one: #8141