
17.24.0 => 18.20.5 upgrade causes destruction of cluster iam_role

Open jdomantay opened this issue 2 years ago • 7 comments

Description

Hello, I'm trying to upgrade my EKS module from 17.24.0 to 18.20.5, but I'm encountering issues where the plan tries to destroy resources, and if I follow through with the apply it causes a cycle error in Terraform.

  • [✅] ✋ I have searched the open/closed issues and my issue is not listed.

Versions

  • Module version [Required]: Current: 17.24.0 Target: 18.20.5
  • Terraform version: >= 0.12

  • Provider version(s): kubernetes 2.10.0, aws 3.75.1

17.24.0 Code

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "17.24.0"

  cluster_name    = "sf-${var.region}-${var.Environment}"
  cluster_version = "1.19"
  vpc_id          = "vpc-026e5737d5686b491"
  subnets         = data.aws_subnet_ids.private.ids
  map_roles       = local.map_roles
  map_users       = local.map_users 
}

18.20.5 Code

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "18.20.5"

  cluster_name    = "${var.region}-${var.Environment}"
  cluster_version = "1.19"

  vpc_id          = "vpc-xxxxxxxx"
  subnet_ids      = data.aws_subnet_ids.private.ids
  
  aws_auth_roles = local.map_roles
  aws_auth_users = local.map_users
  create_aws_auth_configmap = true
  manage_aws_auth_configmap = true

  ################################################################################
  # Required Values to prevent cluster destruction
  ################################################################################

  ################################################################################
  # plan.txt settings
  ################################################################################

  /*
  create_cloudwatch_log_group        = false
  cluster_enabled_log_types          = []
  prefix_separator                   = ""
  iam_role_name                      = "${var.region}-${var.Environment}"
  cluster_security_group_name        = "${var.region}-${var.Environment}"
  cluster_security_group_description = "EKS cluster security group."
  */

  ################################################################################
  # plan1.txt settings
  ################################################################################

  /*
  create_cloudwatch_log_group        = false
  cluster_enabled_log_types          = []
  prefix_separator                   = ""
  iam_role_name                      = "${var.region}-${var.Environment}"
  cluster_security_group_name        = "${var.region}-${var.Environment}"
  cluster_security_group_description = "EKS cluster security group."
  iam_role_arn                       = "arn:aws:iam::xxxxxxxx:role/us-west-2-xxxxxxxxx"
  */

  ################################################################################
  # plan2.txt settings
  ################################################################################

  create_cloudwatch_log_group        = false
  cluster_enabled_log_types          = []
  prefix_separator                   = ""
  iam_role_name                      = "${var.region}-${var.Environment}"
  cluster_security_group_name        = "${var.region}-${var.Environment}"
  cluster_security_group_description = "EKS cluster security group."
  iam_role_arn                       = "arn:aws:iam::xxxxxxxx:role/us-west-2-xxxxxxxxx"
  create_iam_role                    = false

  ################################################################################
  # Required Values to prevent cluster destruction
  ################################################################################

}

Locals.tf


locals {
  map_roles = [
    {
      rolearn  = "arn:aws:iam::${var.account}:role/${var.region}-${var.Environment}-admin"
      username = "${var.region}-${var.Environment}-admin"
      groups   = ["system:masters"]
    },
    {
      rolearn  = "arn:aws:iam::${var.account}:role/${var.region}-${var.Environment}-edit"
      username = "${var.region}-${var.Environment}-edit"
      groups   = ["xxxx"]
    },
    {
      rolearn  = "arn:aws:iam::${var.account}:role/${var.region}-${var.Environment}-read"
      username = "${var.region}-${var.Environment}-read"
      groups   = ["xxxx"]
    }
  ]

  map_users = [
    {
      userarn  = "arn:aws:iam::${var.account}:user/xxxx"
      username = "xxxx"
      groups   = ["system:masters"]
    },
    {
      userarn  = "arn:aws:iam::xxxxxxxxx:user/xxxx"
      username = "xxxx"
      groups   = ["system:masters"]
    }
  ]
}
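
For completeness: with create_aws_auth_configmap / manage_aws_auth_configmap set, the module also needs a configured kubernetes provider to write the ConfigMap. A minimal sketch of that wiring is below; the output names and the exec api_version are my assumptions based on the v18 module and a recent aws CLI, not copied from the actual config.

# Sketch only: kubernetes provider wired to the EKS module outputs.
# cluster_endpoint, cluster_certificate_authority_data and cluster_id are
# assumed v18 output names; client.authentication.k8s.io/v1beta1 assumes a
# reasonably recent aws CLI.
provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    # In v18, cluster_id holds the cluster name.
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}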

Steps to reproduce the behavior:

  1. Update the module configuration for the upgrade (17.24.0 => 18.20.5), as shown above.
  2. terraform init -upgrade; terraform plan

Expected behavior

  1. No destruction of the cluster or its IAM role.

Actual behavior

  1. The first and second configurations (under the plan.txt and plan1.txt comments) force replacement of the cluster IAM role, which in turn triggers recreation of the cluster.
  2. The third configuration (under the plan2.txt comment) successfully migrates to the new module, but it destroys the cluster IAM role (a possible state-level workaround is sketched below).
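
A state-level workaround that might keep the old role with the plan2.txt settings is to drop it from state before applying, so Terraform forgets it rather than destroys it. A minimal sketch; the v17 resource address is an assumption to confirm against terraform state list:

# Sketch only: make Terraform forget the module-managed IAM role so that
# create_iam_role = false + iam_role_arn can take over without destroying it.
# 'module.eks.aws_iam_role.cluster[0]' is the assumed v17 address.
terraform state list | grep aws_iam_role
terraform state rm 'module.eks.aws_iam_role.cluster[0]'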

Attachments: plan.txt, plan1.txt, plan2.txt

jdomantay, May 13 '22 07:05

There is an update coming to the migration docs - check out the WIP here https://github.com/clowdhaus/eks-v17-v18-migrate and let me know if this helps clarify how to handle this

bryantbiggs, May 13 '22 15:05

Has the above migration doc been moved to the EKS module repo? There are steps to move the Terraform state for the node pool (using terraform state mv). I see others instead make Terraform forget the old pool (terraform state rm). I wonder which is better for avoiding downtime. Thanks @bryantbiggs
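
For context, the two approaches I'm weighing look roughly like this; the node group addresses below are hypothetical, and the real ones would come from terraform state list:

# Option A: move the existing managed node group to its new address so it is
# not recreated (both addresses below are hypothetical examples).
terraform state mv \
  'module.eks.module.node_groups.aws_eks_node_group.workers["default"]' \
  'module.eks.module.eks_managed_node_group["default"].aws_eks_node_group.this[0]'

# Option B: make Terraform forget the old node group; it keeps serving traffic
# unmanaged while the new module creates a replacement, and is deleted manually
# once workloads have drained to the new group.
terraform state rm 'module.eks.module.node_groups.aws_eks_node_group.workers["default"]'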

xueshanf, Jun 01 '22 18:06

This issue has been automatically marked as stale because it has been open for 30 days with no activity. Remove the stale label or add a comment, or this issue will be closed in 10 days.

github-actions[bot], Jul 02 '22 00:07

Bump so this doesn't go stale.

drmaples, Jul 04 '22 16:07

I can confirm that I am seeing this same issue even without a migration from 17 -> 18.

The issue appears to be that the aws_auth_configmap_data isn't being fully encoded as YAML. In the example below, the output test appears to be valid, whereas test2 (which is how the configmap resource in this module is defined) is kind of funky:

locals {
  aws_auth_configmap_data = {
    mapRoles    = yamlencode(local.aws_auth_roles)
    mapUsers    = yamlencode(local.aws_auth_users)
    mapAccounts = yamlencode(local.aws_auth_accounts)
  }
  aws_auth_roles = [
    {
      rolearn  = "arn:aws:iam::000000000000:role/AdminRole"
      username = "admin"
      groups   = ["system:masters"]
    },
  ]
  aws_auth_users = [
    {
      userarn  = "arn:aws:iam::000000000000:user/steve"
      username = "steve"
      groups   = ["system:masters"]
    },
    {
      userarn  = "arn:aws:iam::000000000000:user/bob"
      username = "bob"
      groups   = ["system:masters"]
    },
  ]
  aws_auth_accounts = [
    "000000000000"
  ]
}

output "test" {
  value = yamlencode(local.aws_auth_configmap_data)
}

output "test2" {
  value = local.aws_auth_configmap_data
}
➜  t git:(develop) ✗  terraform plan

Changes to Outputs:
  + test  = <<-EOT
        "mapAccounts": |
          - "000000000000"
        "mapRoles": |
          - "groups":
            - "system:masters"
            "rolearn": "arn:aws:iam::000000000000:role/AdminRole"
            "username": "admin"
        "mapUsers": |
          - "groups":
            - "system:masters"
            "userarn": "arn:aws:iam::000000000000:user/steve"
            "username": "steve"
          - "groups":
            - "system:masters"
            "userarn": "arn:aws:iam::000000000000:user/bob"
            "username": "bob"
    EOT
  + test2 = {
      + mapAccounts = <<-EOT
            - "000000000000"
        EOT
      + mapRoles    = <<-EOT
            - "groups":
              - "system:masters"
              "rolearn": "arn:aws:iam::000000000000:role/AdminRole"
              "username": "admin"
        EOT
      + mapUsers    = <<-EOT
            - "groups":
              - "system:masters"
              "userarn": "arn:aws:iam::000000000000:user/steve"
              "username": "steve"
            - "groups":
              - "system:masters"
              "userarn": "arn:aws:iam::000000000000:user/bob"
              "username": "bob"
        EOT
    }

sfozz, Jul 07 '22 23:07

OK figured this out...

I'd defined a variable for the roles and users as:

variable "aws_map_users" {
  type = list(object({}))
}

which meant that every element was converted to an empty object (object({}) has no attributes to keep), so the list of users was rendered as:

- {}
- {}

Changing the var to

variable "aws_map_users" {
  type = list(any)
}

fixes this
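
An alternative that keeps type checking, instead of loosening to list(any), is to declare the full object type. A minimal sketch, assuming the attribute names used in this thread:

# Sketch only: fully typed variable so Terraform keeps (and validates) every
# attribute instead of silently converting each element to an empty object.
variable "aws_map_users" {
  type = list(object({
    userarn  = string
    username = string
    groups   = list(string)
  }))
  default = []
}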

sfozz, Jul 08 '22 02:07

This issue has been automatically marked as stale because it has been open for 30 days with no activity. Remove the stale label or add a comment, or this issue will be closed in 10 days.

github-actions[bot], Aug 08 '22 00:08

This issue was automatically closed because it remained stale for 10 days.

github-actions[bot], Aug 18 '22 00:08

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot], Nov 09 '22 02:11