Stack removal fails
Hey,
we have been using cdk-eks-karpenter for a while now and we have been experiencing issues with removing stacks where Karpenter was installed using this package. Basically, CloudFormation triggers the delete on the CustomResource that installed the YAML manifests into the cluster, and that delete fails / times out. In the EKS console all the nodes have already been removed and the cluster pretty much only exists on paper (I cannot connect to it with kubectl anymore). Eventually the CustomResource times out after 1h and CloudFormation fails.
We have put together a minimal example where the error still occurs and where we do nothing more than create a cluster inside our pre-created VPC and install Karpenter using this package.
import { CONFIG } from '@/src/config';
import { vpcName } from '@/src/utils';
import { KubectlV28Layer } from '@aws-cdk/lambda-layer-kubectl-v28';
import { Stack, StackProps } from 'aws-cdk-lib';
import { InstanceClass, InstanceSize, InstanceType, IVpc, Vpc } from 'aws-cdk-lib/aws-ec2';
import { Cluster, KubernetesVersion } from 'aws-cdk-lib/aws-eks';
import { ManagedPolicy } from 'aws-cdk-lib/aws-iam';
import { Karpenter } from 'cdk-eks-karpenter';
import { Construct } from 'constructs';

export class NodeAutoscaling extends Construct {
  constructor(
    scope: Construct,
    id: string,
    {
      cluster,
      subnetIds,
    }: {
      cluster: Cluster;
      subnetIds: string[];
    },
  ) {
    super(scope, id);

    const karpenter = new Karpenter(this, 'Karpenter', {
      cluster,
      namespace: 'karpenter',
      version: 'v0.34.1',
    });

    const nodeClass = karpenter.addEC2NodeClass('nodeclass', {
      amiFamily: 'AL2',
      subnetSelectorTerms: subnetIds.map((subnetId) => ({ id: subnetId })),
      securityGroupSelectorTerms: [
        {
          tags: {
            'aws:eks:cluster-name': cluster.clusterName,
          },
        },
      ],
      role: karpenter.nodeRole.roleName,
    });

    karpenter.addNodePool('nodepool', {
      template: {
        spec: {
          nodeClassRef: {
            apiVersion: 'karpenter.k8s.aws/v1beta1',
            kind: 'EC2NodeClass',
            name: nodeClass.name,
          },
          requirements: [
            {
              key: 'karpenter.sh/capacity-type',
              operator: 'In',
              values: ['on-demand'],
            },
            {
              key: 'karpenter.k8s.aws/instance-category',
              operator: 'In',
              values: ['m'],
            },
            {
              key: 'karpenter.k8s.aws/instance-generation',
              operator: 'In',
              values: ['5', '6', '7'],
            },
            {
              key: 'kubernetes.io/arch',
              operator: 'In',
              values: ['amd64'],
            },
          ],
        },
      },
    });

    karpenter.addManagedPolicyToKarpenterRole(
      ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
    );
  }
}

export class EksCluster extends Construct {
  public readonly cluster: Cluster;

  constructor(
    scope: Construct,
    id: string,
    {
      environment,
      instanceName,
      vpc,
    }: {
      environment: string;
      instanceName: string;
      vpc: IVpc;
    },
  ) {
    super(scope, id);

    const kubectlLayer = new KubectlV28Layer(this, 'KubectlLayer');

    this.cluster = new Cluster(this, 'Cluster', {
      clusterName: `eks-example-${instanceName}-${environment}`,
      defaultCapacity: 3,
      defaultCapacityInstance: InstanceType.of(InstanceClass.M5, InstanceSize.LARGE),
      kubectlLayer,
      outputConfigCommand: true,
      outputMastersRoleArn: true,
      version: KubernetesVersion.V1_28,
      vpc,
    });

    new NodeAutoscaling(this, 'NodeAutoscaling', {
      cluster: this.cluster,
      subnetIds: vpc.privateSubnets.map(({ subnetId }) => subnetId), // the landing zone creates the subnets in the following pattern <vpcId>-<private|public>-<AZ>
    });
  }
}

export class MinBrokenEks extends Stack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    const vpc = Vpc.fromLookup(this, 'Vpc', { vpcName: vpcName(CONFIG.environment) });
    this.configureClusterAndRoles({ vpc });
  }

  private configureClusterAndRoles({ vpc }: { vpc: IVpc }) {
    const cluster = new EksCluster(this, 'EksCluster', {
      environment: CONFIG.environment,
      instanceName: CONFIG.instanceName,
      vpc,
    });
    return cluster;
  }
}
Hi @lucavb, thanks for reporting this! Could you please clarify which resource fails to delete here? Is it the Helm resource that installs Karpenter?
Hey @andskli, it seems to be the CustomResource that installs either the EC2NodeClass or the NodePool. As I said, the cluster basically has no nodes at that point, and the custom resource that should remove those two resources just times out. Does that help?
Edit: I recreated my example in our account, and here you can see the failure:
The resource that could not be removed:
And the Lambda that times out:
I'm not sure if this is related, but we also just ran into an issue deleting the stack. In our case it failed trying to delete the NodeRole and a NodeClass. In CloudFormation, the event error message points to the instance profile:
Karpenter Node Role:
Resource handler returned message: "Cannot delete entity, must remove roles from instance profile first.
A node class that we provisioned using karpenter.addNodeClass:
CloudFormation did not receive a response from your Custom Resource. Please check your logs for requestId [0a56d113-0455-4c30-bca2-9b64cb2be7fa]. If you are using the Python cfn-response module, you may need to update your Lambda function code so that CloudFormation can attach the updated version.
All we did in this case was add Karpenter to an existing stack, provision a node class to test, and then try to tear it down.
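For anyone hitting the same "Cannot delete entity" error, here is a rough sketch of detaching the role from its instance profiles with the AWS SDK so that the NodeRole deletion can go through. The role name in the example is hypothetical, not something this package creates under that name:

import {
  IAMClient,
  ListInstanceProfilesForRoleCommand,
  RemoveRoleFromInstanceProfileCommand,
} from '@aws-sdk/client-iam';

// Remove a role from every instance profile it is attached to, so that a subsequent
// IAM role deletion (e.g. the Karpenter NodeRole) no longer fails with
// "Cannot delete entity, must remove roles from instance profile first".
async function detachRoleFromInstanceProfiles(roleName: string): Promise<void> {
  const iam = new IAMClient({});
  const { InstanceProfiles = [] } = await iam.send(
    new ListInstanceProfilesForRoleCommand({ RoleName: roleName }),
  );
  for (const profile of InstanceProfiles) {
    await iam.send(
      new RemoveRoleFromInstanceProfileCommand({
        InstanceProfileName: profile.InstanceProfileName!,
        RoleName: roleName,
      }),
    );
  }
}

// Example usage (hypothetical role name):
// await detachRoleFromInstanceProfiles('KarpenterNodeRole-my-cluster');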
@andskli is there any update on this?
I have not had much time to look at this yet. I had a quick check-in and I am able to reproduce it using your example, so thanks for that @lucavb.
Leaving the following as a note to my future self, or to anyone willing to pick this issue up in the next few weeks, as I won't be able to (summer holiday):
What seems to happen is that the EC2NodeClass doesn't get deleted because of the finalizer applied to the resource. I am not sure exactly how to solve this; perhaps we can utilize dependencies between the NodePool and the EC2NodeClass in a clever way, or perhaps we can work on getting a force-delete / remove-finalizer option into the upstream CDK resource that addEC2NodeClass() and addNodePool() use under the hood.
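A rough sketch of the dependency idea; it assumes we can get handles on the two KubernetesManifest constructs created by addEC2NodeClass() and addNodePool(), which cdk-eks-karpenter may not expose directly (they might have to be dug out via node.tryFindChild):

import { KubernetesManifest } from 'aws-cdk-lib/aws-eks';

// Hypothetical helper: make the NodePool manifest depend on the EC2NodeClass manifest.
// CloudFormation deletes dependents first, so the NodePool (and with it Karpenter's
// NodeClaims and their instances) is removed before the EC2NodeClass, which gives the
// EC2NodeClass finalizer a chance to complete instead of hanging the custom resource.
export function orderKarpenterManifests(
  nodeClassManifest: KubernetesManifest,
  nodePoolManifest: KubernetesManifest,
): void {
  nodePoolManifest.node.addDependency(nodeClassManifest);
}

Note that this only helps if the Karpenter controller (the Helm chart) is still running while those manifests are being deleted, so the chart would need the same kind of ordering.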
Status updates:
- Observation: if I only deploy the Karpenter Helm chart and don't deploy any Karpenter Kubernetes CRD YAML manifests, like so:
const karpenter = new Karpenter(stack, 'Karpenter', {
  cluster: cluster,
  namespace: 'kube-system',
  version: '1.3.3', // https://gallery.ecr.aws/karpenter/karpenter
  nodeRole: config.karpenterNodeRole, // custom NodeRole to pass to Karpenter nodes
  helmExtraValues: { // https://github.com/aws/karpenter-provider-aws/blob/v1.3.3/charts/karpenter/values.yaml
    replicas: 1,
  },
});
karpenter.node.addDependency(cluster.awsAuth); // editing order of operations to say deploy karpenter after cluster exists
Then cdk destroy still fails, and it fails immediately within 1 second of running.
cdk destroy dev1-eks
Here's the error message I get:
10:43:34 AM | DELETE_FAILED | Custom::AWSCDK-EKS-HelmChart | dev1-eks/chart-kar...r/Resource/Default
Received response status [FAILED] from custom resource. Message returned: TooManyRequestsException: Rate Exceeded.
at de_TooManyRequestsExceptionRes (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/index.js:4589:21)
at de_CommandError (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/index.js:3969:19)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20
at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/core/dist-cjs/index.js:167:18
at async /var/runtime/node_modules/@aws-sdk/node_modules/@smithy/middleware-retry/dist-cjs/index.js:321:38
at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:33:22
at async invokeUserFunction (/var/task/framework.js:1:2794)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1837) (RequestId: e4d6dd0d-95e4-4ede-af38-a83298588e81)
Now for the good news :), I made some progress on this:
console.log(config.id); // gives 'dev1-eks'; it's important not to pass stack.stackId / verify what the console.log shows, stack.stackId gives a placeholder value '${Token[AWS::StackId.1876]}'
const karpenter_helm_chart_CFR = stack.node.tryFindChild(config.id)?.node.tryFindChild('chart-karpenter')?.node.defaultChild as cdk.CfnResource;
if (karpenter_helm_chart_CFR) { // if not null then do the following
  console.log('setting karpenter to retain to workaround cdk destroy bug');
  karpenter_helm_chart_CFR.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN); // workaround for cdk destroy bug
}
My CDK stack includes the EKS cluster, and I reasoned that if I destroy the cluster, any Kubernetes Helm charts / YAML manifests will get destroyed along with it anyway.
So despite the cdk.RemovalPolicy.RETAIN, they do effectively get destroyed, but RETAIN is enough to work around the issue.
btw @lucavb, could you rename the title from "Stack removal fails" to something more search-optimized like "cdk destroy / stack removal / karpenter delete fails"? I almost missed this and nearly filed a duplicate issue.
Small correction/clarification: you also need to set that removal policy for any Karpenter-specific Kubernetes YAML manifests deployed through CDK.
const karpenter_YAMLs = karpenter_YAML_generator.generate_manifests();

const apply_karpenter_YAML = new eks.KubernetesManifest(stack, 'karpenter_YAMLs', {
  cluster: cluster,
  manifest: karpenter_YAMLs,
  overwrite: true,
  prune: false,
});

const apply_karpenter_YAML_CFR = apply_karpenter_YAML.node.defaultChild as cdk.CfnResource;
if (apply_karpenter_YAML_CFR) { // if not null then do the following
  apply_karpenter_YAML_CFR.applyRemovalPolicy(cdk.RemovalPolicy.RETAIN); // workaround for cdk destroy bug
}

apply_karpenter_YAML.node.addDependency(karpenter); // inform CDK of order of operations
^-- The only downside of this I can think of is that it might result in orphaned EC2 instances from Karpenter NodeClaims after the destroy has occurred.
I personally still think it's a good short-term improvement, because when destroy fails it's currently a ~2 hour cleanup with some manual human interaction needed every hour, and destroy has to be run more than once.
I'm just theorizing that that's a problem, but if my hunch is right, then the following strategy might also allow an automated cleanup workaround to be implemented while we wait for this to be fixed: https://dev.to/aws-builders/aws-custom-resource-using-cdk-387k. This article says it's theoretically possible to trigger custom Lambda logic on Delete. I have no idea how to do that. I went with a different strategy of customizing my Karpenter Kubernetes YAML to set custom EC2 tags, so that if an orphaned EC2 instance existed after destroy, it would be trivially easy to clean it up manually by filtering in the GUI and terminating it.
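For reference, here is a rough sketch of what that Delete-time cleanup could look like. Everything in it is hypothetical and not part of cdk-eks-karpenter: the construct, the cleanup/cluster tag key (which the EC2NodeClass would have to put on its instances, e.g. via its tags field), and the inline handler:

import { CustomResource, Duration } from 'aws-cdk-lib';
import * as iam from 'aws-cdk-lib/aws-iam';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import { Provider } from 'aws-cdk-lib/custom-resources';
import { Construct } from 'constructs';

export class OrphanedNodeCleanup extends Construct {
  constructor(scope: Construct, id: string, props: { clusterName: string }) {
    super(scope, id);

    // Lambda that only acts on the CloudFormation Delete lifecycle event: it looks up
    // instances carrying the (hypothetical) cleanup/cluster tag and terminates them.
    const handler = new lambda.Function(this, 'Handler', {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: 'index.handler',
      timeout: Duration.minutes(5),
      environment: { CLUSTER_TAG_VALUE: props.clusterName },
      code: lambda.Code.fromInline(`
        const { EC2Client, DescribeInstancesCommand, TerminateInstancesCommand } = require('@aws-sdk/client-ec2');
        exports.handler = async (event) => {
          if (event.RequestType !== 'Delete') return {};
          const ec2 = new EC2Client({});
          const res = await ec2.send(new DescribeInstancesCommand({
            Filters: [
              { Name: 'tag:cleanup/cluster', Values: [process.env.CLUSTER_TAG_VALUE] },
              { Name: 'instance-state-name', Values: ['pending', 'running'] },
            ],
          }));
          const ids = (res.Reservations || []).flatMap(r => (r.Instances || []).map(i => i.InstanceId));
          if (ids.length > 0) {
            await ec2.send(new TerminateInstancesCommand({ InstanceIds: ids }));
          }
          return {};
        };
      `),
    });
    handler.addToRolePolicy(new iam.PolicyStatement({
      actions: ['ec2:DescribeInstances', 'ec2:TerminateInstances'],
      resources: ['*'],
    }));

    // Provider-framework custom resource; its Delete event triggers the cleanup. The
    // Karpenter manifests should depend on this resource (manifest.node.addDependency(this))
    // so they are deleted first and the cleanup runs afterwards.
    const provider = new Provider(this, 'Provider', { onEventHandler: handler });
    new CustomResource(this, 'Resource', { serviceToken: provider.serviceToken });
  }
}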