terraform-aws-eks

Fail to update config map in v20

Open vchepkov opened this issue 1 year ago • 16 comments

We are having trouble configuring the aws-auth ConfigMap.

We use GitLab runners in a central AWS account to create and configure EKS clusters in a target account. The aws provider looks like this, and the module successfully creates the EKS cluster:

provider "aws" {
  region = var.region
  assume_role {
    role_arn = "arn:aws:iam::${var.targetId}:role/gitlab-terraform-deployment"
  }
  default_tags {
    tags = var.tags
  }
}

To update the ConfigMap, we configure the kubernetes provider:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  exec {
    api_version = "client.authentication.k8s.io/v1"
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_name, "--role-arn", "arn:aws:iam::${var.targetId}:role/gitlab-terraform-deployment"]
  }
}

and use the aws-auth submodule:

module "eks-auth" {
  source = "github.com/terraform-aws-modules/terraform-aws-eks//modules/aws-auth?ref=v20.8.3"

  create_aws_auth_configmap = true
  manage_aws_auth_configmap = true
}

But this results in an error:

│ Error: Unauthorized
│ 
│   module.eks-auth.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks.eks-auth/modules/aws-auth/main.tf line 14, in resource "kubernetes_config_map" "aws_auth":
│   14: resource "kubernetes_config_map" "aws_auth" {

What are we missing? Thank you.

vchepkov avatar Mar 20 '24 14:03 vchepkov

Does "arn:aws:iam::${var.targetId}:role/gitlab-terraform-deployment" have permissions inside the cluster? For example, is there a cluster access entry that grants that role admin permissions in the cluster?

bryantbiggs avatar Mar 20 '24 14:03 bryantbiggs

I presume it does, since this is the role that creates the cluster through the aws provider?

vchepkov avatar Mar 20 '24 14:03 vchepkov

Not necessarily. If you create a cluster today on v20.x, the IAM identity used to create the cluster does not have any access inside the cluster by default.

bryantbiggs avatar Mar 20 '24 14:03 bryantbiggs

Oh, that must be the missing part. How do I grant that access on creation?

vchepkov avatar Mar 20 '24 15:03 vchepkov

https://github.com/terraform-aws-modules/terraform-aws-eks/blob/1627231af669796669ce83e0a4672a7e6d94a0b3/examples/karpenter/main.tf#L69-L71
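The linked lines appear to set enable_cluster_creator_admin_permissions on the EKS module. A minimal sketch of that setting follows, assuming a module block named eks; the cluster name is a placeholder, and other required inputs (VPC, subnets, node groups) are left out:

```hcl
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 20.0"

  cluster_name = "example" # placeholder, not from the thread

  # Gives the IAM identity that runs Terraform admin access inside the
  # cluster through a cluster access entry.
  enable_cluster_creator_admin_permissions = true

  # vpc_id, subnet_ids, node groups, etc. omitted from this sketch
}
```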

bryantbiggs avatar Mar 20 '24 15:03 bryantbiggs

Karpenter confused me here, I will try, thanks

vchepkov avatar Mar 20 '24 15:03 vchepkov

in that example, the identity needs K8s permissions in order to deploy the Karpenter resources (inside the cluster)

bryantbiggs avatar Mar 20 '24 15:03 bryantbiggs

I got further, but it still fails:

│ Error: configmaps is forbidden: User "arn:aws:sts::xxx:assumed-role/gitlab-terraform-deployment/EKSGetTokenAuth" cannot create resource "configmaps" in API group "" in the namespace "kube-system"

vchepkov avatar Mar 20 '24 15:03 vchepkov

I tried using the CONFIG_MAP authentication mode, but that seems to conflict with the module's other logic:

bootstrapClusterCreatorAdminPermissions must be true if cluster authentication mode is set to CONFIG_MAP

And this parameter is hardcoded, so I don't think CONFIG_MAP is a valid option?

vchepkov avatar Mar 20 '24 16:03 vchepkov

@vchepkov could you please open an AWS support ticket for this configmap permission issue and include your cluster ARN

bryantbiggs avatar Mar 20 '24 17:03 bryantbiggs

I am working with support, but I think that for the authentication modes API_AND_CONFIG_MAP and CONFIG_MAP we have to use bootstrapClusterCreatorAdminPermissions=true.

The error posted above indicates that, and the blog post https://aws.amazon.com/blogs/containers/a-deep-dive-into-simplified-amazon-eks-access-management-controls/ says the creator won't have any permissions otherwise, which I think is the problem we see here.

The support engineer suggested adding an aws_eks_access_entry resource that puts the IAM role in kubernetes_groups = ["system:masters"], but the request is rejected:

InvalidParameterException: The kubernetes group name system:masters is invalid, it cannot start with system:

vchepkov avatar Mar 20 '24 20:03 vchepkov

You cannot add an entry where the group name starts with "system:".

For API and API_AND_CONFIG_MAP, this is what enable_cluster_creator_admin_permissions is providing in lieu of what was previously used in the aws-auth configmap. I've notified the team internally and they will take a look - this should provide sufficient permission to allow the entity to create/edit the configmap https://github.com/terraform-aws-modules/terraform-aws-eks/blob/1627231af669796669ce83e0a4672a7e6d94a0b3/main.tf#L142-L156
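Because group names starting with "system:" are rejected, admin rights through access entries are normally granted by associating an EKS access policy with the entry rather than by using a Kubernetes group. A minimal sketch of doing that by hand, assuming the resource names deployer and deployer_admin (hypothetical) and the role ARN from the original post; this should not be combined with enable_cluster_creator_admin_permissions for the same principal, since both would try to create the same access entry:

```hcl
# Access entry for the deployment role (no kubernetes_groups needed).
resource "aws_eks_access_entry" "deployer" {
  cluster_name  = module.eks.cluster_name
  principal_arn = "arn:aws:iam::${var.targetId}:role/gitlab-terraform-deployment"
}

# Associate the AWS-managed cluster admin policy with that entry,
# scoped to the whole cluster.
resource "aws_eks_access_policy_association" "deployer_admin" {
  cluster_name  = module.eks.cluster_name
  principal_arn = aws_eks_access_entry.deployer.principal_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }
}
```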

bryantbiggs avatar Mar 20 '24 20:03 bryantbiggs

Posting the support recommendation here. I guess we need to add yet another sleep, similar to what is done for custom networking:

Upon further investigation, it was discovered that the access entry association was created on 2024-03-20T18:25:37.929069Z and propagated by 2024-03-20T18:25:39.405Z. Additionally, your attempt to access the ConfigMap was logged at 2024-03-20T18:25:38.903Z. It's important to note that there is a 1.5-second delay for the association to take effect, which explains this expected behavior due to propagation delays.

Recommendation: Please wait for at least a few seconds after creating the access entry association to allow it to take effect before accessing the Kubernetes ConfigMap.

vchepkov avatar Mar 20 '24 23:03 vchepkov

Any (manual) workaround for this?

sbkg0002 avatar Apr 03 '24 08:04 sbkg0002

I added code similar to what this module already uses. IMHO, the module should add this block too before declaring itself "ready".

# There is a delay of about 1.5 seconds between the EKS cluster (and its access entry) being created and the creator being able to interact with it.
# kubectl has retry logic; for the kubernetes provider we need to wait
resource "time_sleep" "this" {
  create_duration = "3s"

  depends_on = [module.eks-cluster]
}
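The sleep only helps if the ConfigMap resources wait on it. A minimal sketch of that wiring, assuming the aws-auth submodule block from the original post and that the hashicorp/time provider is available; the 3s duration is the value used above, not a documented guarantee:

```hcl
module "eks-auth" {
  source = "github.com/terraform-aws-modules/terraform-aws-eks//modules/aws-auth?ref=v20.8.3"

  create_aws_auth_configmap = true
  manage_aws_auth_configmap = true

  # Do not touch the ConfigMap until the sleep has finished, which in turn
  # waits for the cluster module and its access entry.
  depends_on = [time_sleep.this]
}
```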

vchepkov avatar Apr 03 '24 10:04 vchepkov

This issue has been automatically marked as stale because it has been open 30 days with no activity. Remove stale label or comment or this issue will be closed in 10 days

github-actions[bot] avatar May 04 '24 00:05 github-actions[bot]

This issue was automatically closed because of stale in 10 days

github-actions[bot] avatar May 14 '24 00:05 github-actions[bot]

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Jun 13 '24 02:06 github-actions[bot]