terraform-aws-eks
Custom AMI: node groups fail updating user_data because it is created not MIME formatted
Description
tl;dr Amazon EC2 user data in launch templates that are used with EKS managed node groups must be in the MIME multi-part archive format. But setting the enable_bootstrap_user_data: true flag always creates it unformatted.
There is also a misconfiguration in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.
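A minimal sketch of the guard requested above, written as a self-contained Terraform template expression rather than as the module's actual templates/linux_user_data.tpl. The local names and script bodies are assumptions made for illustration; only the idea of placing both hooks behind the same enable_bootstrap_user_data condition comes from this report.

```hcl
# Hypothetical stand-in for the relevant part of a user data template.
# Names and script bodies are illustrative, not the module's real template.
locals {
  enable_bootstrap_user_data = false
  pre_bootstrap_user_data    = "echo 'before bootstrap'"
  post_bootstrap_user_data   = "echo 'after bootstrap'"

  # Both hooks sit inside the same guard, so neither one is rendered
  # when enable_bootstrap_user_data is false.
  user_data = <<-EOT
    %{if local.enable_bootstrap_user_data~}
    #!/bin/bash
    set -e
    ${local.pre_bootstrap_user_data}
    echo "bootstrap call would go here"
    ${local.post_bootstrap_user_data}
    %{endif~}
  EOT
}

output "user_data" {
  value = local.user_data
}
```

Flipping enable_bootstrap_user_data to true in this sketch renders both hooks around the placeholder bootstrap line.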
Now the longer version. When deploying EKS node groups from launch templates with custom AMI types (platform "linux"), and the node group is either not EKS managed or has cloud-init bootstrap user data enabled, that bootstrap data is pushed to the AWS APIs as is, not MIME formatted. That is because the MIME-wrapping data source "cloudinit_config" "linux_eks_managed_node_group" is NOT evaluated in such scenarios.
The flags is_eks_managed_node_group: false and/or enable_bootstrap_user_data: true make its triggering condition false (perhaps expected for the former, but not for the latter!): var.create && var.platform == "linux" && var.is_eks_managed_node_group && !var.enable_bootstrap_user_data && var.pre_bootstrap_user_data != "" && var.user_data_template_path == "".
That is a problem, since the AWS API accepts user data that is not MIME formatted on the create request and only fails later, when updating that node group's launch template (Ec2LaunchTemplateInvalidConfiguration: User data was not in the MIME multipart format).
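The triggering condition quoted above can be checked in isolation. The sketch below copies the expression into locals and feeds it values assumed to mirror this report (an EKS managed node group with enable_bootstrap_user_data set to true); the local and output names are hypothetical.

```hcl
locals {
  # Assumed inputs mirroring the settings described in this issue.
  create                     = true
  platform                   = "linux"
  is_eks_managed_node_group  = true
  enable_bootstrap_user_data = true
  pre_bootstrap_user_data    = "export USE_MAX_PODS=false"
  user_data_template_path    = ""

  # Same boolean expression as the condition quoted in the issue text.
  mime_wrapping_enabled = (
    local.create &&
    local.platform == "linux" &&
    local.is_eks_managed_node_group &&
    !local.enable_bootstrap_user_data &&
    local.pre_bootstrap_user_data != "" &&
    local.user_data_template_path == ""
  )
}

output "mime_wrapping_enabled" {
  value = local.mime_wrapping_enabled
}
```

With these inputs the expression is false because of the !local.enable_bootstrap_user_data term, so the MIME-wrapping path is skipped. Setting enable_bootstrap_user_data to false makes it true.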
Related https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1875
Related https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2059
Related https://github.com/hashicorp/terraform-provider-aws/issues/15007
- [x] ✋ I have searched the open/closed issues and my issue is not listed.
⚠️ Note
Before you submit an issue, please perform the following first:
- Remove the local .terraform directory (! ONLY if state is stored remotely, which hopefully you are following that best practice!): rm -rf .terraform/
- Re-initialize the project root to pull down modules: terraform init
- Re-attempt your terraform plan or apply and check if the issue still persists
Versions
- Module version [Required]: v18.23.0
- Terraform version: Terraform v1.2.3 on linux_amd64
- Provider version(s):
+ provider registry.terraform.io/hashicorp/aws v4.21.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.12.0
+ provider registry.terraform.io/hashicorp/tls v3.4.0
Reproduction Code [Required]
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.72"
    }
  }
  required_version = ">= 0.13.1"
}

provider "aws" {
  # Must match the profile name in your ~/.okta_aws_login_config file
  profile = "<profile>"
  region  = "us-east-1"
}

data "aws_caller_identity" "current" {}

locals {
  name            = "test-1"
  cluster_version = "1.20"
  region          = "us-east-1"
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.12.0"

  name                   = local.name
  cidr                   = "10.0.0.0/16"
  azs                    = ["us-east-1a", "us-east-1b"]
  private_subnets        = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets         = ["10.0.101.0/24", "10.0.102.0/24"]
  enable_nat_gateway     = true
  single_nat_gateway     = true
  one_nat_gateway_per_az = false
}

resource "tls_private_key" "this" {
  algorithm = "RSA"
}

resource "aws_key_pair" "this" {
  key_name_prefix = local.name
  public_key      = tls_private_key.this.public_key_openssh
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.23.0"

  cluster_name                         = local.name
  cluster_version                      = local.cluster_version
  cluster_endpoint_private_access      = true
  cluster_endpoint_public_access       = true
  cluster_endpoint_public_access_cidrs = [<cidrs redacted>]

  vpc_id = module.vpc.vpc_id
  subnet_ids = concat(
    module.vpc.private_subnets,
    module.vpc.public_subnets,
  )

  eks_managed_node_group_defaults = {
    ami_type                   = "AL2_x86_64"
    platform                   = "linux"
    disk_size                  = 50
    instance_types             = ["m5.large"]
    enable_bootstrap_user_data = true
    pre_bootstrap_user_data    = <<-EOT
      export CONTAINER_RUNTIME="containerd"
      export USE_MAX_PODS=false
    EOT
    bootstrap_extra_args       = "--container-runtime containerd --kubelet-extra-args '--max-pods=20 --instance-type t3a.large'"
    post_bootstrap_user_data   = <<-EOT
      echo "All done"
    EOT
  }

  eks_managed_node_groups = {
    # Default node group - as provided by AWS EKS
    default_node_group = {
    }
  }
}
workspaces: No
cleared the local cache: Yes
Expected behavior
- When enable_bootstrap_user_data: true, user data should be defined in MIME format
- When enable_bootstrap_user_data: false, user data should contain neither pre_bootstrap_user_data nor post_bootstrap_user_data
- For self-managed nodes (is_eks_managed_node_group: false), there should still be a way to control whether user data is put into launch templates MIME formatted or not
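For reference, MIME multi-part user data can be produced with the cloudinit_config data source from the hashicorp/cloudinit provider listed under Versions. This is a minimal sketch of the expected shape of the output, not the module's own code; the resource name "example" and the script body are assumptions.

```hcl
terraform {
  required_providers {
    cloudinit = {
      source  = "hashicorp/cloudinit"
      version = ">= 2.0"
    }
  }
}

# Wraps a single shell script part into a MIME multi-part archive.
data "cloudinit_config" "example" {
  base64_encode = true  # launch templates expect base64-encoded user data
  gzip          = false

  part {
    content_type = "text/x-shellscript"
    content      = <<-EOT
      #!/bin/bash
      echo "All done"
    EOT
  }
}

output "user_data" {
  # Base64 string that decodes to a "Content-Type: multipart/mixed" document.
  value = data.cloudinit_config.example.rendered
}
```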
Actual behavior
- When enable_bootstrap_user_data: true, user data will not be defined in MIME format
- When enable_bootstrap_user_data: false, user data will contain pre_bootstrap_user_data, but it will not contain post_bootstrap_user_data
- For self-managed nodes (is_eks_managed_node_group: false), user data cannot be defined in MIME format
Workarounds
- Use neither user data nor custom AMIs, so that the EKS managed node group service sets this for you
- Exploit the issue in the template by setting enable_bootstrap_user_data: false, which gets pre_bootstrap_user_data injected in MIME format
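The second workaround could look roughly like the following, reusing values from the reproduction code. The locals wrapper is only there to keep the sketch self-contained; in practice the map would be passed as eks_managed_node_group_defaults to the eks module. It relies on the triggering condition quoted earlier, which additionally requires a non-empty pre_bootstrap_user_data and an empty user_data_template_path.

```hcl
locals {
  # Assumed node group defaults for the workaround described above.
  eks_managed_node_group_defaults = {
    ami_type = "AL2_x86_64"
    platform = "linux"

    # Keeping this false leaves the MIME-wrapping condition satisfied.
    enable_bootstrap_user_data = false

    # Must be non-empty for the MIME-wrapping condition to hold.
    pre_bootstrap_user_data = <<-EOT
      export CONTAINER_RUNTIME="containerd"
      export USE_MAX_PODS=false
    EOT
  }
}

output "workaround_defaults" {
  value = local.eks_managed_node_group_defaults
}
```

Note that post_bootstrap_user_data and bootstrap_extra_args are left out here, since the report states that post_bootstrap_user_data is not inserted when the flag is false.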
Terminal Output Screenshot(s)
Additional context
@bryantbiggs PTAL
@bogdando have you read the documentation and looked at what the examples produce? I believe this is a lack of understanding of the settings - the current user data settings are correct
@bryantbiggs there is at least an issue with the documented enable_bootstrap_user_data case. This issue covers that misbehavior as well:
...in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.
Another problem is that conditions controlling application of MIME tags for the user data are too restrictive: setting the enable_bootstrap_user_data: true flag always creates it unformatted, while in the docs there is this in the summary:
If users supply an ami_id (the issue submitter's note: or ami_type), the service no longer supplies user data to bootstrap nodes; users can enable enable_bootstrap_user_data and use the module provided user data template, or provide their own user data template
But when users enable that flag and then provide their own data template, or use the shipped one, the user data will not be MIME formatted, hence this issue arises. Have you read the issue description carefully and tried the reproducing steps?
The user data is working as intended and we have both an internal module as well as an example dedicated just to handling user data that demonstrates this
@bryantbiggs hi, as you have closed this one, have you checked the reported issue subjects perchance, or should I still find a reproducer for it:
Amazon EC2 user data in launch templates that are used with EKS managed node groups must be in the MIME multi-part archive format. But setting the enable_bootstrap_user_data: true flag always creates it unformatted.
There is also a misconfiguration in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.
I split the 2nd issue into a dedicated one: https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2246
Let me update the reproducer and description for the remaining MIME formatting issue only.
@bryantbiggs updated! The reproducer clearly shows an issue. Please reopen
@bryantbiggs could you please take a look at "Terminal Output Screenshot(s)" and reopen it? There is clearly an issue, and the updated reproducer clearly shows a failure