terraform-aws-eks-blueprints
create_launch_template breaks cluster DNS
I attempted to create a managed node group with the following definition:
infra = {
  node_group_name        = "infra"
  instance_types         = ["m6i.2xlarge"]
  create_launch_template = true
  desired_size           = 1
  min_size               = 1
  max_size               = 1
  subnet_ids             = [module.vpc.private_subnets[0]]
  k8s_labels = {
    purpose = "infra"
  }
  k8s_taints = [
    { key = "purpose", value = "infra", effect = "NO_SCHEDULE" }
  ]
}
While the node group is created successfully, pods scheduled on it cannot resolve any services by name through DNS. I verified this by following the steps from https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/#create-a-simple-pod-to-use-as-a-test-environment:
- Downloaded the pod yaml
- Added tolerations and affinity so that it runs on the node group
Then, I run
kubectl -n kafka exec -i -t dnsutils -- nslookup -debug my-cluster-zookeeper-client
and get
% kubectl -n kafka exec -i -t dnsutils -- nslookup -debug my-cluster-zookeeper-client
;; connection timed out; no servers could be reached
Recreating the node group without "create_launch_template" fixes the issue.
For context, I need create_launch_template so that I can use pre_userdata, so that I can mess with local storage.
It seems that enabling create_launch_template requires me to specify some additional settings, but from reading the documentation, it's not clear which ones exactly.
When you use the default launch template created by the EKS managed node group service, it attaches the cluster's primary security group. When you use a custom launch template, it attaches the security groups created by the module. To keep the security groups consistent across node groups, choose either the default launch template or a custom launch template and use it for all of them.
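As a rough illustration of the "be consistent" guidance, the sketch below declares two node groups that both opt in to a custom launch template, so both would receive the security groups created by the module. It is only a sketch: the `managed_node_groups` local name and the `apps` group are assumptions made for illustration, while `module.vpc` and the `infra` settings are taken from the original report.

```hcl
locals {
  # Sketch only: every group opts in to a custom launch template, so all
  # nodes receive the security groups created by the module.
  managed_node_groups = {
    infra = {
      node_group_name        = "infra"
      instance_types         = ["m6i.2xlarge"]
      create_launch_template = true
      subnet_ids             = [module.vpc.private_subnets[0]]
    }

    # Hypothetical second group, not part of the original report.
    apps = {
      node_group_name        = "apps"
      instance_types         = ["m6i.2xlarge"]
      create_launch_template = true
      subnet_ids             = [module.vpc.private_subnets[0]]
    }
  }
}
```

The opposite choice, leaving `create_launch_template` unset on every group, should be equally consistent, since all nodes would then carry the cluster's primary security group.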
Does that mean that if I start using create_launch_template = true for any node group, I should switch the other existing node groups too? I guess it's an important constraint that could be added to the documentation, then?
Does that mean that if I start using create_launch_template = true for any node group, I should switch the other existing node groups too?
That depends on the outcome you are trying to achieve. The guidance was more general: if you want your node groups to use a consistent set of security groups, you need to configure your code to match that expectation.
I guess it's an important constraint that could be added to the documentation, then?
It is documented at https://docs.aws.amazon.com/eks/latest/userguide/launch-templates.html
I'd respectfully disagree. That documentation is the general EKS documentation. The fact that setting create_launch_template = true in EKS Blueprints makes the launch template use a different security group, one that cannot communicate with the other node groups, is behaviour specific to EKS Blueprints. I think it violates the principle of least surprise, so it would be better to document it.
This is not specific to EKS Blueprints; it is the behavior of the EKS managed node group service, which is why I linked the appropriate documentation.
Isn't it possible for EKS Blueprints, when creating a launch template, to use the cluster's primary security group by default?
It could do that, yes, but it currently does not make that assumption. When users elect to use a custom launch template, it is left up to users to prescribe their intended settings via the custom launch template.
tl;dr
- Using the default launch template means the EKS managed node group service supplies default and/or prescriptive settings (for example, it uses the cluster primary security group)
- Using a custom launch template means users provide their own settings, and the module prescribes only the bare minimum settings needed to successfully create a cluster
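If mixing default and custom launch templates cannot be avoided, one possible way to "provide your own settings" is to open DNS traffic between the two sets of security groups yourself. The sketch below is a minimal example under stated assumptions, not something the module does: `var.cluster_name` and `var.custom_node_security_group_id` are hypothetical inputs, and it assumes the cluster DNS pods run on nodes that carry the cluster primary security group.

```hcl
# Hypothetical inputs: neither variable comes from the original thread.
variable "cluster_name" {
  type        = string
  description = "Name of the existing EKS cluster"
}

variable "custom_node_security_group_id" {
  type        = string
  description = "Security group attached to nodes launched from the custom launch template"
}

# Look up the cluster so its primary security group ID can be read.
data "aws_eks_cluster" "this" {
  name = var.cluster_name
}

locals {
  cluster_primary_sg_id = data.aws_eks_cluster.this.vpc_config[0].cluster_security_group_id
}

# Allow DNS queries (UDP and TCP port 53) from the custom launch template
# nodes to nodes that carry the cluster primary security group.
resource "aws_security_group_rule" "dns_from_custom_nodes" {
  for_each = toset(["udp", "tcp"])

  description              = "DNS (${each.value}) from custom launch template nodes"
  type                     = "ingress"
  from_port                = 53
  to_port                  = 53
  protocol                 = each.value
  security_group_id        = local.cluster_primary_sg_id
  source_security_group_id = var.custom_node_security_group_id
}
```

This only covers DNS. Other pod-to-pod traffic between the two groups would likely need additional rules, which is one reason the consistent-selection approach may be simpler.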
Closing out with the guidance provided above. If you have any further questions, please feel free to open a new issue.