terraform-aws-eks

Custom AMI: node groups fail updating user_data because it is created not MIME formatted

Open bogdando opened this issue 3 years ago • 3 comments

Description

tl;dr Amazon EC2 user data in launch templates that are used with EKS managed node groups must be in the MIME multi-part archive format. But setting the enable_bootstrap_user_data: true flag always creates it unformatted.

There is also a misconfiguration in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.

Now the longer version. When deploying EKS node groups from launch templates with custom AMI types (platform "linux"), and the node group is either not EKS managed or has cloud-init user data provided, that bootstrap data is pushed to the AWS APIs as is, not MIME formatted. That is because the MIME-wrapping data source "cloudinit_config" "linux_eks_managed_node_group" will NOT be evaluated in such scenarios.

The flags is_eks_managed_node_group: false and/or enable_bootstrap_user_data: true make the triggering condition for that MIME wrapping evaluate to false (perhaps expected for the former, but not for the latter!): var.create && var.platform == "linux" && var.is_eks_managed_node_group && !var.enable_bootstrap_user_data && var.pre_bootstrap_user_data != "" && var.user_data_template_path == "".
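To make the effect of that condition concrete, here is a minimal standalone sketch that evaluates the same boolean expression with hard-coded values. The local names and the output are hypothetical stand-ins for the module's variables, not code taken from the module.

```hcl
# Sketch only: hard-coded locals stand in for the module's input variables.
locals {
  create                     = true
  platform                   = "linux"
  is_eks_managed_node_group  = true
  enable_bootstrap_user_data = true # the flag under discussion
  pre_bootstrap_user_data    = "export USE_MAX_PODS=false"
  user_data_template_path    = ""

  # Same expression as the triggering condition quoted in the issue text.
  mime_wrapping_enabled = (
    local.create &&
    local.platform == "linux" &&
    local.is_eks_managed_node_group &&
    !local.enable_bootstrap_user_data &&
    local.pre_bootstrap_user_data != "" &&
    local.user_data_template_path == ""
  )
}

output "mime_wrapping_enabled" {
  # false with the values above, because !enable_bootstrap_user_data is false
  value = local.mime_wrapping_enabled
}
```

Flipping enable_bootstrap_user_data to false in this sketch makes the expression true, which matches the behavior described in this issue.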

That is a problem, because the AWS API accepts user data that is not MIME formatted on the create request and only fails later, when that node group's launch template is updated (Ec2LaunchTemplateInvalidConfiguration: User data was not in the MIME multipart format).
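For reference, MIME multipart user data can be produced outside the module with the cloudinit_config data source from the hashicorp/cloudinit provider. The following is a rough sketch, not the module's implementation: the resource names and the cluster_name variable are hypothetical, the AMI is assumed to ship /etc/eks/bootstrap.sh (as the EKS optimized Amazon Linux 2 AMIs do), and launch template settings such as image_id are omitted.

```hcl
# Hypothetical names throughout; this is not code from the module.
variable "cluster_name" {
  type = string
}

data "cloudinit_config" "node_user_data" {
  base64_encode = true # launch template user data must be base64 encoded
  gzip          = false
  boundary      = "//"

  # A single shell script part; more part blocks can be added as needed.
  part {
    content_type = "text/x-shellscript"
    content      = <<-EOT
      #!/bin/bash
      set -e
      # Assumes the AMI provides the EKS bootstrap script at this path.
      /etc/eks/bootstrap.sh ${var.cluster_name}
    EOT
  }
}

resource "aws_launch_template" "example" {
  name_prefix = "example-"

  # rendered holds the complete MIME multipart document, base64 encoded.
  user_data = data.cloudinit_config.node_user_data.rendered
}
```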

Related https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1875
Related https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2059
Related https://github.com/hashicorp/terraform-provider-aws/issues/15007

  • [x] ✋ I have searched the open/closed issues and my issue is not listed.

⚠️ Note

Before you submit an issue, please perform the following first:

  1. Remove the local .terraform directory (! ONLY if state is stored remotely, which hopefully you are following that best practice!): rm -rf .terraform/
  2. Re-initialize the project root to pull down modules: terraform init
  3. Re-attempt your terraform plan or apply and check if the issue still persists

Versions

  • Module version [Required]: v18.23.0

  • Terraform version:

Terraform v1.2.3
on linux_amd64
  • Provider version(s):
+ provider registry.terraform.io/hashicorp/aws v4.21.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.12.0
+ provider registry.terraform.io/hashicorp/tls v3.4.0

Reproduction Code [Required]

terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 3.72"
    }
  }
  required_version = ">= 0.13.1"
}

provider "aws" {
  # Must match the profile name in your ~/.okta_aws_login_config file
  profile = "<profile>"
  region  = "us-east-1"
}

data "aws_caller_identity" "current" {}

locals {
  name            = "test-1"
  cluster_version = "1.20"
  region          = "us-east-1"
}

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.12.0"

  name = local.name
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway     = true
  single_nat_gateway     = true
  one_nat_gateway_per_az = false
}

resource "tls_private_key" "this" {
  algorithm = "RSA"
}

resource "aws_key_pair" "this" {
  key_name_prefix = local.name
  public_key      = tls_private_key.this.public_key_openssh
}

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.23.0"

  cluster_name                    = local.name
  cluster_version                 = local.cluster_version
  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = true

  cluster_endpoint_public_access_cidrs = [<cidrs redacted>]

  vpc_id     = module.vpc.vpc_id
  subnet_ids = concat(
    module.vpc.private_subnets,
    module.vpc.public_subnets,
  )

  eks_managed_node_group_defaults = {
    ami_type                   = "AL2_x86_64"
    platform                   = "linux"
    disk_size                  = 50
    instance_types             = ["m5.large"]
    enable_bootstrap_user_data = true
    pre_bootstrap_user_data    = <<-EOT
      export CONTAINER_RUNTIME="containerd"
      export USE_MAX_PODS=false
      EOT
    bootstrap_extra_args       = "--container-runtime containerd --kubelet-extra-args '--max-pods=20 --instance-type t3a.large'"
    post_bootstrap_user_data   = <<-EOT
      echo "All done"
      EOT
  }

  eks_managed_node_groups = {
    # Default node group - as provided by AWS EKS
    default_node_group = {
    }
  }
}

workspaces: No

cleared the local cache: Yes

Expected behavior

  • When enable_bootstrap_user_data: true, user data should be rendered in MIME format
  • When enable_bootstrap_user_data: false, user data should contain neither pre_bootstrap_user_data nor post_bootstrap_user_data
  • For self-managed nodes (is_eks_managed_node_group: false), there should still be a way to control whether user data is put into launch templates MIME formatted or not

Actual behavior

  • When enable_bootstrap_user_data: true, user data is not rendered in MIME format
  • When enable_bootstrap_user_data: false, user data contains pre_bootstrap_user_data but not post_bootstrap_user_data
  • For self-managed nodes (is_eks_managed_node_group: false), user data cannot be rendered in MIME format

Workarounds

  • Let the EKS managed node group service set the user data for you (only applies when no custom user data is needed and no custom AMIs are in use)
  • Exploit the template issue described above: set enable_bootstrap_user_data: false so that pre_bootstrap_user_data is still injected, and in MIME format
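The second workaround might look roughly like the fragment below. It is a sketch under assumptions: the local name is hypothetical, the values are borrowed from the reproduction code, and the map would be passed as eks_managed_node_group_defaults to the eks module. Whether this suits a given custom AMI setup is not verified here.

```hcl
# Hypothetical local holding the node group defaults for the workaround.
locals {
  node_group_defaults_workaround = {
    ami_type       = "AL2_x86_64"
    platform       = "linux"
    instance_types = ["m5.large"]

    # With this flag off, the triggering condition quoted in the description
    # holds, so pre_bootstrap_user_data goes through the MIME wrapping path.
    enable_bootstrap_user_data = false

    pre_bootstrap_user_data = <<-EOT
      export CONTAINER_RUNTIME="containerd"
      export USE_MAX_PODS=false
    EOT
  }
}
```

In the reproduction code this would be wired in as eks_managed_node_group_defaults = local.node_group_defaults_workaround.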

Terminal Output Screenshot(s)

Additional context

bogdando Jul 06 '22 07:07

@bryantbiggs PTAL

bogdando Jul 11 '22 06:07

@bogdando have you read the documentation and looked at what the examples produce? I believe this is a lack of understanding of the settings - the current user data settings are correct

bryantbiggs Jul 17 '22 12:07

@bryantbiggs there is at least an issue with the documented enable_bootstrap_user_data case. This issue covers that misbehavior as well:

...in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.

Another problem is that conditions controlling application of MIME tags for the user data are too restrictive: setting the enable_bootstrap_user_data: true flag always creates it unformatted, while in the docs there is this in the summary:

If users supply an ami_id (the issue submitter note: or ami_type), the service no longer supplies user data to bootstrap nodes; users can enable enable_bootstrap_user_data and use the module provided user data template, or provide their own user data template

But when users enable that flag and then either provide their own user data template or use the shipped one, the result is not MIME formatted, hence this issue. Have you read the issue description carefully and tried the reproduction steps?

bogdando Jul 26 '22 06:07

The user data is working as intended and we have both an internal module as well as an example dedicated just to handling user data that demonstrates this

bryantbiggs Aug 20 '22 20:08

@bryantbiggs hi, as you have closed this one, have you perchance checked the subjects reported in the issue, or should I still find a reproducer for them:

Amazon EC2 user data in launch templates that are used with EKS managed node groups must be in the MIME multi-part archive format. But setting the enable_bootstrap_user_data: true flag always creates it unformatted.

There is also a misconfiguration in templates/linux_user_data.tpl which always inserts pre_bootstrap_user_data, even if enable_bootstrap_user_data: false. The latter setting should control insertion of pre_bootstrap_user_data just like it already does for post_bootstrap_user_data.

bogdando Sep 28 '22 07:09

I split the 2nd issue into a dedicated one: https://github.com/terraform-aws-modules/terraform-aws-eks/issues/2246. Let me update the reproducer and description for the remaining MIME formatting issue only

bogdando Sep 28 '22 07:09

@bryantbiggs updated! reproducer clearly shows an issue. Please reopen

bogdando Sep 28 '22 08:09

@bryantbiggs could you please take a look at "Terminal Output Screenshot(s)" and reopen this? There is clearly an issue, and the updated reproducer clearly shows a failure

bogdando Sep 29 '22 07:09