
Can't install NTH with Karpenter

Open jdwil opened this issue 3 years ago • 7 comments

Describe the bug

Not sure if this is a bug, or I'm just doing it wrong... but all the documentation I've read for Karpenter suggests using AWS Node Termination Handler to gracefully handle spot instance termination.

Expected Behavior

I'd expect an EKS cluster to be deployed with both Karpenter and AwsNodeTerminationHandler installed (along with the rest of my addons).

Current Behavior

.../node_modules/@aws-quickstart/eks-blueprints/lib/addons/aws-node-termination-handler/index.ts:72
    assert(asgCapacity && asgCapacity.length > 0, 'AWS Node Termination Handler is only supported for self-managed nodes');
    ^
AssertionError [ERR_ASSERTION]: AWS Node Termination Handler is only supported for self-managed nodes

...

    at Module.load (node:internal/modules/cjs/loader:981:32)
    at Function.Module._load (node:internal/modules/cjs/loader:822:12) {
  generatedMessage: false,
  code: 'ERR_ASSERTION',
  actual: false,
  expected: true,
  operator: '=='
}

Reproduction Steps

import 'source-map-support/register';
import * as cdk from 'aws-cdk-lib';
import {aws_eks} from 'aws-cdk-lib';
import * as blueprints from '@aws-quickstart/eks-blueprints';
import * as process from "process";

const app = new cdk.App();

const karpenterAddonProps = {
    provisionerSpecs: {
        'node.kubernetes.io/instance-type': [
            'a1.medium',
            'a1.large',
            'a1.xlarge',
            'a1.2xlarge',
            'a1.4xlarge',
            'c5.large',
            'c5.xlarge',
            'c5.2xlarge',
            'c5.4xlarge',
            'd3.xlarge',
            'd3.2xlarge',
            'd3.4xlarge',
            'm5.large',
            'm5a.large',
            'm5.xlarge',
            'm5a.xlarge',
            'm5.2xlarge',
            'm5a.2xlarge',
            'm5.4xlarge',
            'm5a.4xlarge',
            't3.nano',
            't3.micro',
            't3.small',
            't3.medium',
            't3.large',
            't3.xlarge',
            't3.2xlarge'
        ],
        'topology.kubernetes.io/zone': ['us-west-2a'],
        'kubernetes.io/arch': ['amd64','arm64'],
        'karpenter.sh/capacity-type': ['spot','on-demand'],
    },
    subnetTags: {
        'aws:cloudformation:stack-name': 'EKS',
    },
    securityGroupTags: {
        'aws:eks:cluster-name': 'EKS',
    },
};

const addOns: Array<blueprints.ClusterAddOn> = [
    new blueprints.addons.CalicoOperatorAddOn(),
    new blueprints.addons.MetricsServerAddOn(),
    new blueprints.addons.ContainerInsightsAddOn(),
    new blueprints.addons.AwsLoadBalancerControllerAddOn(),
    new blueprints.addons.VpcCniAddOn(),
    new blueprints.addons.CoreDnsAddOn(),
    new blueprints.addons.KubeProxyAddOn(),
    new blueprints.addons.XrayAddOn(),
    new blueprints.addons.AwsNodeTerminationHandlerAddOn(),
    new blueprints.addons.KarpenterAddOn(karpenterAddonProps),
];

blueprints.EksBlueprint.builder()
    .version(aws_eks.KubernetesVersion.V1_21)
    .region(process.env.REGION)
    .account(process.env.ACCOUNT)
    .addOns(...addOns)
    .build(app, 'EKS');

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.32.0

EKS Blueprints Version

1.1.0

Node.js Version

v16.13.2

Environment details (OS name and version, etc.)

Debian sid

Other information

Thanks in advance for any help on this.

jdwil avatar Jul 22 '22 15:07 jdwil

@jdwil thanks for submitting the issue. AWS Node Termination Handler is only available for self-managed nodegroups on EKS. From your setup, you are using a default nodegroup, which is a managed nodegroup.

To deploy EKS Blueprints with self-managed nodegroup, please take a look here.
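For reference, a minimal sketch of a blueprint with a self-managed nodegroup, which is what the NTH addon's assertion requires. The `AsgClusterProvider` property names below are assumptions based on the blueprints version in use; check the linked docs for the exact interface:

```typescript
import * as cdk from 'aws-cdk-lib';
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as blueprints from '@aws-quickstart/eks-blueprints';

const app = new cdk.App();

// Self-managed nodes backed by an Auto Scaling Group; the NTH addon
// asserts on asgCapacity, so the default managed nodegroup won't work.
const clusterProvider = new blueprints.AsgClusterProvider({
    id: 'self-managed-ng',   // hypothetical nodegroup id
    minSize: 1,
    maxSize: 3,
    instanceType: new ec2.InstanceType('m5.large'),
});

blueprints.EksBlueprint.builder()
    .clusterProvider(clusterProvider)
    .addOns(new blueprints.addons.AwsNodeTerminationHandlerAddOn())
    .build(app, 'EKS-self-managed');
```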

youngjeong46 avatar Jul 26 '22 22:07 youngjeong46

@youngjeong46 Why doesn't cdk-eks-blueprints support installing NTH on managed nodegroups? If we skip eks-blueprints and just launch an EKS cluster with managed nodegroups ourselves, NTH can be installed with its Helm chart.

Furthermore, cdk-eks-blueprints doesn't give us full control over the Auto Scaling Group Cluster Provider, e.g. the instance capacity type (spot instances).

vumdao avatar Jul 28 '22 05:07 vumdao

Same ticket https://github.com/aws-quickstart/cdk-eks-blueprints/issues/387

vumdao avatar Jul 28 '22 05:07 vumdao

@jdwil please vote for this ticket https://github.com/aws-quickstart/cdk-eks-blueprints/issues/392 if you're installing NTH.

vumdao avatar Jul 28 '22 05:07 vumdao

@youngjeong46 Thank you for the link. I will try switching to ASG as soon as I can. I still don't quite understand why this rule of not allowing NTH with MNGs applies when you're installing Karpenter. I have a managed node group, but Karpenter scales the nodes, and it does not use a node group. I'm very new to all this, so it's probably something obvious I'm missing.

jdwil avatar Jul 28 '22 10:07 jdwil

@jdwil that is a good point regarding Karpenter. Initially there was no restriction with respect to NTH; it was added in response to a separate issue raised against NTH and MNG. We will review it, and the easiest path seems to be to simply relax the constraint.

shapirov103 avatar Jul 28 '22 14:07 shapirov103

@youngjeong46 can we relax this constraint? I'm running into the same issue with a cluster that only runs Fargate+Karpenter.

javydekoning avatar Oct 11 '22 14:10 javydekoning

@javydekoning

This has been resolved: NTH is supported alongside Karpenter up to Karpenter v0.19, at which point Karpenter's native interruption handling takes over. Closing the issue.
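For newer setups, the NTH addon can be dropped entirely in favor of Karpenter's native handling. A sketch of what that might look like; the `interruptionHandling` flag is an assumption about the `KarpenterAddOn` props in later blueprint releases, so verify against the addon docs:

```typescript
import * as cdk from 'aws-cdk-lib';
import * as blueprints from '@aws-quickstart/eks-blueprints';

const app = new cdk.App();

// With Karpenter >= v0.19, NTH is no longer needed: native interruption
// handling provisions the infrastructure Karpenter watches for spot
// interruption and rebalance-recommendation events.
blueprints.EksBlueprint.builder()
    .addOns(new blueprints.addons.KarpenterAddOn({
        interruptionHandling: true,  // assumed prop name; see addon docs
    }))
    .build(app, 'EKS-karpenter');
```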

youngjeong46 avatar Jan 12 '23 19:01 youngjeong46