terraform-aws-eks
eks_managed_group network_interfaces device_index: unable to set up multiple NICs correctly with a launch template
Description
When attempting to set up a managed node group with an instance type that supports multiple NICs, such as a p4d.24xlarge, the launch template is configured incorrectly, resulting in nodes being unable to start.
Versions
- Module version: 19.15.3
- Terraform version: 1.5.0
- Provider version(s): Terraform v1.5.0 on darwin_arm64
  - provider registry.terraform.io/hashicorp/aws v5.3.0
  - provider registry.terraform.io/hashicorp/cloudinit v2.3.2
  - provider registry.terraform.io/hashicorp/kubernetes v2.21.1
  - provider registry.terraform.io/hashicorp/time v0.9.1
  - provider registry.terraform.io/hashicorp/tls v4.0.4
Reproduction Code
I am using the https://github.com/terraform-aws-modules/terraform-aws-eks/tree/v19.15.3/examples/eks_managed_node_group example and have replaced all node groups with this config:
eks_managed_node_groups = {
  gpu_a100_80g = {
    ami_type       = "AL2_x86_64_GPU"
    subnet_ids     = [module.vpc.private_subnets[0]]
    desired_size   = 0
    min_size       = 0
    max_size       = 4
    instance_types = ["p4d.24xlarge"]
    tags = {
      "eks.absci-ai.cloud/node-purpose" = "gpu_a100_80g"
    }
    labels = {
      "eks.absci-ai.cloud/node-purpose" = "gpu_a100_80g"
      "k8s.amazonaws.com/accelerator"   = "nvidia-tesla-a100"
    }
    network_interfaces = [
      {
        description                 = "EFA interface 1"
        delete_on_termination       = true
        device_index                = 0
        associate_public_ip_address = false
        interface_type              = "efa"
        efa_enabled                 = true
      },
      {
        description                 = "EFA interface 2"
        delete_on_termination       = true
        device_index                = 1
        associate_public_ip_address = false
        interface_type              = "efa"
        efa_enabled                 = true
      },
      {
        description                 = "EFA interface 3"
        delete_on_termination       = true
        device_index                = 2
        associate_public_ip_address = false
        interface_type              = "efa"
        efa_enabled                 = true
      },
      {
        description                 = "EFA interface 4"
        delete_on_termination       = true
        device_index                = 3
        associate_public_ip_address = false
        interface_type              = "efa"
        efa_enabled                 = true
      }
    ]
    pre_bootstrap_user_data = <<-EOT
      # Install EFA
      curl -O https://efa-installer.amazonaws.com/aws-efa-installer-latest.tar.gz
      tar -xf aws-efa-installer-latest.tar.gz && cd aws-efa-installer
      ./efa_installer.sh -y
      /opt/amazon/efa/bin/fi_info -p efa -t FI_EP_RDM > /tmp/efa_info
      # Disable ptrace
      sysctl -w kernel.yama.ptrace_scope=0
    EOT
  }
}
Steps to reproduce the behavior:
Run the example above, then try to scale up the node group.
Expected behavior
The instance should be able to start up.
Actual behavior
The instance fails to launch because of an incorrect NIC configuration in the launch template.
Additional context
Here is a screenshot of what the network cards look like with the incorrect index.
For reference, here is what a working configuration looks like when created with eksctl, which supports EFA and multiple NICs.
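For comparison, here is a rough sketch of how the four EFA interfaces might be declared so that each one lands on its own network card. It rests on assumptions I have not verified against module v19.15.3: that the module passes `network_card_index` through to the `aws_launch_template` `network_interfaces` block, and that a p4d.24xlarge expects device index 0 on the first card and device index 1 on each additional card. The local name `efa_network_interfaces` is hypothetical.

```hcl
locals {
  # Hypothetical helper: one EFA interface per network card (a p4d.24xlarge has 4 cards).
  efa_network_interfaces = [
    for card in range(4) : {
      description                 = "EFA interface ${card + 1}"
      delete_on_termination       = true
      network_card_index          = card               # assumption: forwarded by the module
      device_index                = card == 0 ? 0 : 1  # assumption: 0 on the first card, 1 on the rest
      associate_public_ip_address = false
      interface_type              = "efa"
    }
  ]
}
```

If those assumptions hold, the node group above could set `network_interfaces = local.efa_network_interfaces` instead of the hand-written list.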