terraform-aws-eks-blueprints

create_launch_template breaks cluster DNS

Open vprus opened this issue 2 years ago • 2 comments

I have attempted to create a managed node group with the following definition

    infra = {
      node_group_name = "infra"
      instance_types = ["m6i.2xlarge"]
      create_launch_template = true
      desired_size    = 1
      min_size        = 1
      max_size        = 1
      subnet_ids      = [module.vpc.private_subnets[0]]
      k8s_labels = {
        purpose = "infra"
      }
      k8s_taints = [
        { key = "purpose", value = "infra", effect = "NO_SCHEDULE" }
      ]
    }

The node group is created successfully, but pods scheduled on it cannot resolve any services by name through DNS. I verified this by following the steps from https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#create-a-simple-pod-to-use-as-a-test-environment:

  • Downloaded the pod yaml
  • Added tolerations and affinity so that it runs on the node group

Then, I run

kubectl -n kafka exec -i -t dnsutils -- nslookup -debug my-cluster-zookeeper-client

and get

% kubectl -n kafka exec -i -t dnsutils -- nslookup -debug my-cluster-zookeeper-client
;; connection timed out; no servers could be reached

Recreating the node group without create_launch_template fixes the issue.

For context, I need create_launch_template so that I can use pre_userdata to configure local storage.

It seems that enabling create_launch_template requires me to specify some additional settings, but from reading the documentation, it's not clear which ones.

vprus avatar Sep 14 '22 16:09 vprus

When you use the default launch template created by the EKS managed node group service, it attaches the cluster's primary security group. When you use a custom launch template, the security groups created by the module are attached instead. If you want your node groups to share the same security groups, pick either the default launch template or a custom launch template and use it consistently.
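As a rough illustration of being consistent, the fragment below sets create_launch_template on every node group, so they should all receive the module-created security groups. It reuses only attributes from the definition in the original post; the second group, "apps", and its values are hypothetical, and the entries are assumed to sit in the same map as the original infra entry.

```hcl
# Sketch only: every node group opts in to the module-created launch template,
# so none of them falls back to the cluster primary security group.
infra = {
  node_group_name        = "infra"
  instance_types         = ["m6i.2xlarge"]
  create_launch_template = true
  desired_size           = 1
  min_size               = 1
  max_size               = 1
  subnet_ids             = [module.vpc.private_subnets[0]]
}

# Hypothetical second node group, shown only to illustrate the matching setting.
apps = {
  node_group_name        = "apps"
  instance_types         = ["m6i.2xlarge"]
  create_launch_template = true
  desired_size           = 1
  min_size               = 1
  max_size               = 1
  subnet_ids             = [module.vpc.private_subnets[0]]
}
```

Whether this alone restores DNS depends on where the cluster DNS pods run and on the rules in the module's security groups, so treat it as a starting point and not a confirmed fix.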

bryantbiggs avatar Sep 15 '22 14:09 bryantbiggs

That means, that if I started to use create_launch_template=true for any nodegroup, I should switch other existing node groups too? I guess it's an important constraint that could be added to the documentation, then?

vprus avatar Sep 15 '22 14:09 vprus

> That means, that if I started to use create_launch_template=true for any nodegroup, I should switch other existing node groups too?

That depends on the outcome you are trying to achieve. The guidance was more general in that if you want your node groups to use consistent set(s) of security groups, you will need to configure your code in such a way to match that expectation.

> I guess it's an important constraint that could be added to the documentation, then?

It is documented here: https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html

bryantbiggs avatar Sep 15 '22 14:09 bryantbiggs

I'd respectfully disagree. That documentation is the general EKS documentation. The fact that setting create_launch_template = true in EKS Blueprints makes the launch template use a different security group, one that cannot communicate with other node groups, is behaviour specific to EKS Blueprints. I think it violates the principle of least surprise, so it should be documented.

vprus avatar Sep 15 '22 14:09 vprus

This is not specific to EKS Blueprints. It is the behavior of the EKS Managed Node Group service, which is why I linked the appropriate documentation.

bryantbiggs avatar Sep 15 '22 14:09 bryantbiggs

Isn't it possible for EKS Blueprints, when creating a launch template, to use the cluster's primary security group by default?

vprus avatar Sep 15 '22 15:09 vprus

It could do that, yes, but it currently does not make that assumption. When users elect to use a custom launch template, it is left up to users to prescribe their intended settings via the custom launch template.

tl;dr

  • Using the default launch template means the EKS managed node group service supplies default and/or prescriptive settings (for example, it attaches the cluster primary security group)
  • Using a custom launch template means users provide their own settings, and the module prescribes only the bare minimum needed to successfully create a cluster
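For anyone who needs to mix both kinds of node group, one possible way to prescribe the missing setting is to allow DNS traffic from the custom launch template nodes into the cluster primary security group. This is a minimal sketch using the standard AWS provider resource aws_security_group_rule; the two variables and the resource names are hypothetical placeholders, and it assumes the cluster DNS pods run on nodes that carry the cluster primary security group.

```hcl
# Hypothetical inputs: supply the real security group IDs from your own configuration.
variable "cluster_primary_security_group_id" {
  type        = string
  description = "Security group attached to node groups that use the default launch template"
}

variable "node_security_group_id" {
  type        = string
  description = "Security group attached to node groups that use the custom launch template"
}

# Allow DNS queries over UDP from custom launch template nodes.
resource "aws_security_group_rule" "dns_udp_from_custom_lt_nodes" {
  description              = "DNS over UDP from custom launch template nodes"
  type                     = "ingress"
  protocol                 = "udp"
  from_port                = 53
  to_port                  = 53
  security_group_id        = var.cluster_primary_security_group_id
  source_security_group_id = var.node_security_group_id
}

# DNS falls back to TCP for large responses, so allow that as well.
resource "aws_security_group_rule" "dns_tcp_from_custom_lt_nodes" {
  description              = "DNS over TCP from custom launch template nodes"
  type                     = "ingress"
  protocol                 = "tcp"
  from_port                = 53
  to_port                  = 53
  security_group_id        = var.cluster_primary_security_group_id
  source_security_group_id = var.node_security_group_id
}
```

These rules cover name resolution only. Pods that must also reach services running on the other node groups would need further rules for the relevant ports.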

bryantbiggs avatar Sep 15 '22 16:09 bryantbiggs

Closing out with the guidance provided above. If you have any further questions, please feel free to open a new issue.

bryantbiggs avatar Oct 08 '22 12:10 bryantbiggs