[Bug] Fails to delete cluster when envelope encryption secret is gone
What were you trying to accomplish?
Delete the cluster. Before that, the envelope encryption key had been deleted (in my case, probably during an incomplete or failed attempt to delete the surrounding infrastructure).
What happened?
Deleting the cluster fails:
eksctl delete cluster -f eksctl-ClusterConfig.yaml --disable-nodegroup-eviction
2025-12-01 10:53:59 [ℹ] deleting EKS cluster "xxx"
2025-12-01 10:54:00 [ℹ] deleted 0 Fargate profile(s)
2025-12-01 10:54:00 [✔] kubeconfig has been updated
2025-12-01 10:54:00 [ℹ] cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress
Error: cannot delete Kubernetes Ingress default/xxx: Internal error occurred: failed to decrypt DEK, error: rpc error: code = Unknown desc = failed to decrypt operation error KMS: Decrypt, https response error StatusCode: 400, RequestID: xxx, KMSInvalidStateException: arn:aws:kms:eu-central-1:xxx:key/xxxx is pending deletion.
With --force, it just gets stuck at the "cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress" step.
How to reproduce it?
Create a cluster with an envelope encryption key:
secretsEncryption:
  keyARN: arn:aws:kms:eu-central-1:xxx:key/xxxx
Then schedule that KMS key for deletion (so that it is in the "pending deletion" state), and only afterwards delete the cluster.
Anything else we need to know?
A good solution would be to add a timeout around the "cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress", and print a warning if there can be leftovers, but continue deleting the cluster with --force.
Versions
$ eksctl info
0.214.0
Do you know how long you waited?
> A good solution would be to add a timeout around the "cleaning up AWS load balancers created by Kubernetes objects of Kind Service or Ingress", and print a warning if there can be leftovers, but continue deleting the cluster with --force.
I believe we already have this, but it is possible that it is not working as expected: https://github.com/eksctl-io/eksctl/blob/9be9168639d42d57504250d47270556cb964a7a8/pkg/actions/cluster/delete.go#L57-L65
I'm also not entirely sure what is getting stuck, since without --force you are getting an internal error, right?
@NicholasBlaskey TBH I do not know, but usually this is one of those jobs I start in a terminal, then do something else and check back more than 30 minutes later. Since I share that cluster with my team, I cannot test deletion right away, but at the next opportunity I'll delete the key by hand first to reproduce this.