
v20 does not support EKS cluster creation with authentication_mode=CONFIG_MAP

Open johnkeates opened this issue 1 year ago • 6 comments

Description

Similarly to #2925, it is not possible to create a new EKS cluster with only CONFIG_MAP enabled for authentication.

InvalidParameterException: bootstrapClusterCreatorAdminPermissions must be true if cluster authentication mode is set to CONFIG_MAP

Versions

  • Module version: "20.8.5"
  • Terraform version: Terraform v1.5.1
  • Provider version(s): aws v5.48.0

Reproduction Code

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "20.8.5"
  cluster_name         = "test"
  cluster_version          =  "1.29"
  authentication_mode         = "CONFIG_MAP"

  cluster_endpoint_private_access         = true
  cluster_endpoint_public_access          = false
}

Steps to reproduce the behavior: apply the configuration above; the failure occurs immediately.

Expected behavior

Create an EKS cluster with aws-auth only.

Actual behavior

No cluster is created: the EKS API does not allow this configuration while bootstrap_cluster_creator_admin_permissions is hardcoded to false by the module.
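For context, a minimal sketch of the access_config block on the underlying aws_eks_cluster resource (not this module) shows the combination the EKS API rejects; the IAM role and subnet references are hypothetical placeholders.

resource "aws_eks_cluster" "example" {
  name     = "test"
  role_arn = aws_iam_role.cluster.arn # hypothetical IAM role

  vpc_config {
    subnet_ids = var.subnet_ids # hypothetical subnets
  }

  access_config {
    authentication_mode = "CONFIG_MAP"

    # The module hardcodes this to false, but the EKS CreateCluster API
    # requires it to be true when authentication_mode is CONFIG_MAP,
    # which produces the InvalidParameterException above.
    bootstrap_cluster_creator_admin_permissions = false
  }
}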

johnkeates avatar May 05 '24 23:05 johnkeates

Correct, it's not supported

bryantbiggs avatar May 05 '24 23:05 bryantbiggs

I suppose that means the true 'bug' would be this line: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/afadb14e44d1cdbd852dbae815be377c4034e82a/variables.tf#L42

since this module does not support the first option listed.

johnkeates avatar May 05 '24 23:05 johnkeates

It's not a bug, it's a decision we have elected to take. Any particular reason you want to use the configmap-only mode?

bryantbiggs avatar May 06 '24 02:05 bryantbiggs

For us it is a mix of reasons, but the main one is that we add role ARNs that do not exist (yet) during cluster seeding, which is not possible with access entries; a secondary reason is that we seed aws-auth from Terraform and then have ArgoCD take over the runtime configuration.

We generally have a group of AWS accounts, one EKS cluster per account, where a controlling cluster owns multiple workload clusters. The controlling cluster runs application management functions (like ArgoCD) and distributes configurations and deployments. The order in which the clusters and AWS accounts themselves are created is not deterministic but is eventually consistent.

The groups themselves are managed by a global ArgoCD instance, whose role is to provision the other ArgoCD instances.

Terraform kickstarts the AWS account and VPC baseline, and an EKS-specific payload is then applied on top of that, also with Terraform, to deploy EKS plus a minimal configuration that allows remote control of the cluster. This is done with simple string templating for the ARNs, which may not exist (yet) if an entire environment is being replaced, seeded, or reset to whatever is in Git (in case someone made manual changes).

While we don't technically require configmap-exclusive access management, we are not using access entries, so when reading the module's documentation we used the values documented as supported; the described value turns out not to be supported at all, and the configuration cannot be applied. We have also been trying out a forked aws-auth setup that supports wildcards (for SSO roles), which access entries don't support either.
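As an illustration of the seeding pattern described above, here is a minimal sketch assuming the v20 aws-auth sub-module; the account ID variable and role name are hypothetical, with the ARN built by plain string templating.

module "aws_auth" {
  source  = "terraform-aws-modules/eks/aws//modules/aws-auth"
  version = "~> 20.0"

  manage_aws_auth_configmap = true

  aws_auth_roles = [
    {
      # This ARN may not exist yet when the configmap is seeded; the
      # configmap tolerates that, whereas an access entry would not.
      rolearn  = "arn:aws:iam::${var.account_id}:role/platform-admin"
      username = "platform-admin"
      groups   = ["system:masters"]
    },
  ]
}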

The comment above bootstrap_cluster_creator_admin_permissions suggests this default was chosen because it is a one-time setting at cluster creation, so it is better to set it to false and use access entries instead, since those can be changed at will; but the same reasoning applies to downgrading authentication_mode, which can also only be done at creation time.

So, in short: the documentation of this module says something can be configured a certain way, but it can't. It's fine if we can't, but in such a case the documentation should probably not contradict the EKS API or AWS provider. We ran into that because we currently don't use access entries, and IAM-wise we don't tend to enable things we don't use.

johnkeates avatar May 06 '24 17:05 johnkeates

> So, in short: the documentation of this module says something can be configured a certain way, but it can't. It's fine if we can't, but in such a case the documentation should probably not contradict the EKS API or AWS provider. We ran into that because we currently don't use access entries, and IAM-wise we don't tend to enable things we don't use.

I don't follow; I don't see where we are contradicting anything. If you have an existing cluster that was created prior to v20 of this module, you can think of it as configmap-only, and you can upgrade to v20 and keep that setting if you wish. However, for new clusters created with v20 and later, we only support clusters that utilize access entries, either in API-only mode or API and configmap. The latter just means it is possible to create access entries; it doesn't mean you have to use them. This is why the default on v20 is API_AND_CONFIG_MAP.

So in your scenario, I would stick with the v20 defaults and keep using the configmap as you wish. The only scenario where this currently becomes an issue is with local clusters on Outposts; otherwise, I don't see any issues with what we have provided.
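A minimal sketch of the suggested approach, reusing the reproduction configuration but with the v20 default authentication mode spelled out explicitly; access entries become available but nothing forces their use.

module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "20.8.5"

  cluster_name    = "test"
  cluster_version = "1.29"

  # v20 default; the configmap keeps working, access entries are merely possible.
  authentication_mode = "API_AND_CONFIG_MAP"

  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = false
}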

bryantbiggs avatar May 06 '24 19:05 bryantbiggs

So the fact that authentication_mode = "CONFIG_MAP" yields an InvalidParameterException, yet is documented as supported (and not documented as upgrade-only), is not a [documentation] bug?

I figured either of two things can be true:

  1. CONFIG_MAP is supported as described

  2. CONFIG_MAP is not supported, contrary to what is described

Maybe this makes my comment seem pedantic, but this is just what we observed when upgrading from 19.x to 20.x.

Perhaps this is due to some of our clusters being wiped and re-created from a single codebase, so that codebase is used not only to roll forward (with in-place upgrades) but also to reset to a known state. I know that from an upgrade-only perspective, https://github.com/terraform-aws-modules/terraform-aws-eks/blob/afadb14e44d1cdbd852dbae815be377c4034e82a/docs/UPGRADE-20.0.md?plain=1#L13 would apply perfectly fine.

Either way, this still doesn't allow us to create clusters with CONFIG_MAP only (unless we fall back to v19 first and move to v20 post-creation), regardless of whether this is strictly a documentation issue. We can remedy that by changing our process to use API_AND_CONFIG_MAP, and then either updating all clusters or waiting until they are rotated out.

Do I close this as a non-bug, or do I make a PR to update the documentation and description to reflect the fact that this doesn't work for new clusters?

johnkeates avatar May 06 '24 19:05 johnkeates

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions[bot] avatar Jun 15 '24 02:06 github-actions[bot]