
dial tcp 127.0.0.1:80: connect: connection refused

Open ArchiFleKs opened this issue 2 years ago • 50 comments

Description

I know there are numerous issues (#817) related to this problem, but since v18.20.1 reintroduced management of the aws-auth configmap, I thought we could discuss it in a new one because the old ones are closed.

The behavior is still very weird. I updated my module to use the configmap management feature and the first run went fine (I was using the aws_eks_cluster_auth datasource). When I run the module with no change, I get no error in either plan or apply.

I then tried to update my cluster from v1.21 to v1.22, and plan and apply began to fail with the following well-known error:

null_resource.node_groups_asg_tags["m5a-xlarge-b-priv"]: Refreshing state... [id=7353592322772826167]
╷
│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
│
│   with kubernetes_config_map_v1_data.aws_auth[0],
│   on main.tf line 428, in resource "kubernetes_config_map_v1_data" "aws_auth":
│  428: resource "kubernetes_config_map_v1_data" "aws_auth" {
│
╵

I then moved to the exec plugin as recommended by the documentation and removed the old datasource from state. I still got the same error.

Something I don't get: when setting the variable (export KUBE_CONFIG_PATH=$PWD/kubeconfig) as suggested in #817, things work as expected.

I'm sad to see things are still unusable (not related to this module, but on the Kubernetes provider side). The load_config_file option was removed from the Kubernetes provider a while ago, and I don't see why this variable needs to be set or how it could be set beforehand.
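(For what it's worth, once a cluster exists the kubeconfig can be generated with the AWS CLI and pointed at by that variable; a minimal sketch, with <cluster-name> as a placeholder:

aws eks update-kubeconfig --name <cluster-name> --kubeconfig "$PWD/kubeconfig"
export KUBE_CONFIG_PATH=$PWD/kubeconfig

though that still doesn't answer why the provider needs it in the first place.)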

Anyway, if someone has managed to use the re-added configmap management feature, I'd be glad to know how to work around this and to help debug this issue.

PS: I'm using Terragrunt; not sure if that is related, but it might be.

  • [X] ✋ I have searched the open/closed issues and my issue is not listed.

Versions

  • Module version [Required]:
Terraform v1.1.7
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.9.0
+ provider registry.terraform.io/hashicorp/cloudinit v2.2.0
+ provider registry.terraform.io/hashicorp/kubernetes v2.10.0
+ provider registry.terraform.io/hashicorp/null v3.1.1
+ provider registry.terraform.io/hashicorp/tls v3.3.0

Reproduce

Here is my provider block

provider "kubernetes" {
  host                   = data.aws_eks_cluster.cluster.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.cluster.certificate_authority.0.data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", data.aws_eks_cluster.cluster.id]
  }
}

data "aws_eks_cluster" "cluster" {
  name = aws_eks_cluster.this[0].id
}

ArchiFleKs avatar Apr 11 '22 20:04 ArchiFleKs

I have the same issue, but when I work with the state as another AWS user, I get an error like:

Error: Unauthorized  

with module.eks.module.eks.kubernetes_config_map.aws_auth[0],   
on .terraform/modules/eks.eks/main.tf line 411, in resource "kubernetes_config_map" "aws_auth":
411: resource "kubernetes_config_map" "aws_auth" {

PLeS207 avatar Apr 11 '22 20:04 PLeS207

Would you try replacing aws_eks_cluster.this[0].id with the hard-coded cluster name?

I guess aws_eks_cluster.this[0].id would only be known after apply because you're going to bump the EKS cluster version. That's why the data source is indeterminate, and the kubernetes provider falls back to the default 127.0.0.1:80.
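For example, a minimal sketch of that suggestion (with "my-cluster" as a placeholder name):

data "aws_eks_cluster" "cluster" {
  # Hard-coded cluster name ("my-cluster" is a placeholder) instead of
  # aws_eks_cluster.this[0].id, so the value is known at plan time.
  name = "my-cluster"
}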

FeLvi-zzz avatar Apr 11 '22 21:04 FeLvi-zzz

Would you try replacing aws_eks_cluster.this[0].id with the hard-coded cluster name?

I guess aws_eks_cluster.this[0].id would only be known after apply because you're going to bump the EKS cluster version. That's why the data source is indeterminate, and the kubernetes provider falls back to the default 127.0.0.1:80.

Not quite true; if the data source fails to find a result, it's a failure, not indeterminate.

@ArchiFleKs you shouldn't need the data source at all; does this still present the same issue?

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

bryantbiggs avatar Apr 11 '22 23:04 bryantbiggs

Would you try replacing aws_eks_cluster.this[0].id with the hard-coded cluster name? I guess aws_eks_cluster.this[0].id would only be known after apply because you're going to bump the EKS cluster version. That's why the data source is indeterminate, and the kubernetes provider falls back to the default 127.0.0.1:80.

Not quite true; if the data source fails to find a result, it's a failure, not indeterminate.

@ArchiFleKs you shouldn't need the data source at all; does this still present the same issue?

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

You can't run these in TF Cloud though, because of the local exec.

sergiofteixeira avatar Apr 12 '22 10:04 sergiofteixeira

Would you try replacing aws_eks_cluster.this[0].id with the hard-coded cluster name? I guess aws_eks_cluster.this[0].id would only be known after apply because you're going to bump the EKS cluster version. That's why the data source is indeterminate, and the kubernetes provider falls back to the default 127.0.0.1:80.

Not quite true; if the data source fails to find a result, it's a failure, not indeterminate. @ArchiFleKs you shouldn't need the data source at all; does this still present the same issue?

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

You can't run these in TF Cloud though, because of the local exec.

This is merely pointing to what the Kubernetes provider documentation specifies; the module doesn't have any influence over this aspect.

bryantbiggs avatar Apr 12 '22 10:04 bryantbiggs

I can confirm that this snippet works as expected without the datasource:

provider "kubernetes" {
  host                   = aws_eks_cluster.this[0].endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.this[0].certificate_authority.0.data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", aws_eks_cluster.this[0].id]
  }
}

ArchiFleKs avatar Apr 12 '22 12:04 ArchiFleKs

I know Hashi is hiring and has made some hires recently to start offering more support for the Kubernetes and Helm providers, so hopefully some of these quirks get resolved soon! For now, we can just keep sharing what others have found to work for their setups 🤷🏽‍♂️

bryantbiggs avatar Apr 12 '22 12:04 bryantbiggs

Unfortunately, it doesn't seem to work with TF Cloud (it gets the Error: failed to create kubernetes rest client for read of resource: Get "http://localhost/api?timeout=32s": dial tcp 127.0.0.1:80: connect: connection refused error). I locked the module at v18.19 so it still works.

evenme avatar Apr 12 '22 15:04 evenme

Apparently using the kubectl provider instead of the kubernetes provider (even completely removing it) made it work with Terraform Cloud 🤷‍♀️ :

provider "kubectl" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.cluster.token
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

but unfortunately this deleted the previously working aws-auth and was not able to create a new one (Error: The configmap "aws-auth" does not exist...) :|

evenme avatar Apr 12 '22 17:04 evenme

I just ran into this while debugging an issue during redeployment of a cluster. I'm not sure exactly how it happened, but we ended up in a state where the cluster had been destroyed, which meant terraform could not connect to the cluster (duh...) using the provider and thus defaulted to 127.0.0.1 when trying to touch the config map...

As mentioned, I'm not sure exactly how it ended up in that state, but it got so bad that I'd get this dial tcp 127.0.0.1:80: connect: connection refused error on terraform plan even with all references to the config map removed. Turns out there was still a reference to the config map in the state file, so removing that using terraform state rm module.eks.this.kubernetes_config_map_v1_data.aws_auth allowed me to redeploy...
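Roughly, the recovery steps looked like this (the exact state address will differ depending on your configuration):

terraform state list | grep kubernetes_config_map
terraform state rm 'module.eks.kubernetes_config_map_v1_data.aws_auth[0]'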

Maybe not applicable to most of you, but hopefully it's useful for someone in the future :D

MadsRC avatar Apr 19 '22 20:04 MadsRC

Hey all, let me know if it's still worthwhile to leave this issue open. I don't think there is anything further we can do here in this module to help alleviate any of the issues shown; there seems to be some variability in terms of what works or does not work for folks. I might be biased, but I think the best place to source some improvements/resolution would be upstream with the other providers (Kubernetes, Helm, Kubectl, etc.).

bryantbiggs avatar Apr 22 '22 15:04 bryantbiggs

I'm also experiencing this; in the meantime, are there any workarounds?

I'm experiencing the same problem with the latest version. Initial creation of the cluster worked fine, but when trying to update any resources after creation I get the same error.

│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
│
│   with module.eks.kubernetes_config_map_v1_data.aws_auth[0],
│   on .terraform/modules/eks/main.tf line 431, in resource "kubernetes_config_map_v1_data" "aws_auth":
│  431: resource "kubernetes_config_map_v1_data" "aws_auth" {
│

Same as the example below, except I had multiple profiles on my machine and had to specify the profile: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/examples/eks_managed_node_group/main.tf#L5-L15

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)
  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id, "--profile", "terraformtest"]
  }
}

kaykhancheckpoint avatar Apr 25 '22 12:04 kaykhancheckpoint

I faced the same, then checked the state using terraform state list and found k8s-related entries there. Then I removed them using:

terraform state rm module.eks.kubernetes_config_map.aws_auth[0]

And that helped to resolve the issue.

DimamoN avatar Apr 25 '22 12:04 DimamoN

The previous suggestions didn't work for me (maybe I misunderstood something).

  1. export KUBE_CONFIG_PATH=$PWD/kubeconfig

This kubeconfig does not appear to exist in my current path...

  2. Deleting the datasource

The latest version of this example and module does not use a datasource and instead just uses module.eks.cluster_id, but I still get this error.


I ended up deleting the aws_auth from the state, which allowed me to continue and resolve the connection refused problem.

terraform state rm 'module.eks.kubernetes_config_map_v1_data.aws_auth[0]'

I don't know what the implications of rm'ing this state are; is it safe to keep removing it whenever we encounter this error?

kaykhancheckpoint avatar Apr 25 '22 12:04 kaykhancheckpoint

A brand new cluster and TF state, EKS 1.22:

terraform {
  required_version = ">= 1.1.8"
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = ">= 4.9"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.10"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = ">= 1.13.1"
    }
  }
}

provider "aws" {
  alias  = "without_default_tags"
  region = var.aws_region
  assume_role {
    role_arn = var.assume_role_arn
  }
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}
locals {
  ## strips 'aws-reserved/sso.amazonaws.com/' from the AWSReservedSSO Role ARN
  aws_iam_roles_AWSReservedSSO_AdministratorAccess_role_arn_trim = replace(one(data.aws_iam_roles.AWSReservedSSO_AdministratorAccess_role.arns), "/[a-z]+-[a-z]+/([a-z]+(\\.[a-z]+)+)\\//", "")

  aws_auth_roles = concat([
    {
      rolearn  = data.aws_iam_role.terraform_role.arn
      username = "terraform"
      groups   = ["system:masters"]
    },
    {
      rolearn  = local.aws_iam_roles_AWSReservedSSO_AdministratorAccess_role_arn_trim
      username = "sre"
      groups   = ["system:masters"]
    }
  ],
    var.aws_auth_roles,
  )
}
  # aws-auth configmap
  create_aws_auth_configmap = var.self_managed_node_groups != [] ? true : null
  manage_aws_auth_configmap = true
  aws_auth_roles            = local.aws_auth_roles
  aws_auth_users            = var.aws_auth_users
  aws_auth_accounts         = var.aws_auth_accounts

leads to:

│ Error: Unauthorized
│
│   with module.eks.module.eks.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks.eks/main.tf line 414, in resource "kubernetes_config_map" "aws_auth":
│  414: resource "kubernetes_config_map" "aws_auth" {

Any ideas @bryantbiggs? Thanks in advance.

FernandoMiguel avatar Apr 26 '22 11:04 FernandoMiguel

@FernandoMiguel I'm seeing something similar in a configuration I'm working with. After some thought, I believe you'll need to add the assumed role to your configuration:

provider "aws" {
  alias  = "without_default_tags"
  region = var.aws_region
  assume_role {
    role_arn = var.assume_role_arn
  }
}

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id,"--role", var.assume_role_arn]
  }
}

Sadly this isn't a solution for me. The configuration I'm working with uses dynamic credentials fed in.

Something along the lines of:

provider "aws" {
  access_key = <access_key>
  secret_key = <secret_key>
  token = <token>
  region = <region>
}

This is useful when a temporary VM, container, or TFE instance is running the Terraform execution.

Going down this route, the provider is fed the connection information and it is used entirely within the provider context (no AWS config process was ever used).

The problem is that none of that data is stored or carried over, so when the kubernetes provider tries to run the exec it defaults to the methods the aws cli uses (meaning a locally stored config in ~/.aws/config or ~/.aws/credentials). In my case that doesn't exist.

@FernandoMiguel it looks like you are presumably using a ~/.aws/config, so passing the assumed role and possibly the profile (if not using the default) should help move that forward. I can't guarantee it will fix it, but that's the theory.
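A quick way to double-check which identity the AWS CLI itself resolves (independent of whatever was fed to the Terraform aws provider) is:

aws sts get-caller-identity

If that returns the wrong role, or fails, the exec in the kubernetes provider will run into the same problem.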

mebays avatar Apr 29 '22 16:04 mebays

No config and no aws creds hardcoded. Everything is assume role from a global var. This works on hundreds of our projects.

FernandoMiguel avatar Apr 29 '22 17:04 FernandoMiguel

If you mean the cli exec, that's running from aws-vault exec --server

FernandoMiguel avatar Apr 29 '22 17:04 FernandoMiguel

@FernandoMiguel Hmm well that's interesting. I was able to get a solution to work for me.

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws-iam-authenticator"
    # This requires aws-iam-authenticator to be installed locally where Terraform is executed
    args = ["token", "-i", module.eks.cluster_id]
  }
}

This seemed to work for me, but I also had to expose my endpoint publicly for the first run. Our network configuration was locked down too tightly for our remote execution server to hit the endpoint. That could be something else to make sure you check.

If you mean the cli exec, that's running from aws-vault exec --server

What I meant was that if credentials are being passed to the aws provider, that doesn't necessarily mean they are being passed to the kubernetes provider. Some troubleshooting you could try is TF_LOG=debug terraform plan ... to get more information, if you haven't tried that already. If you really wanted to test whether the kubernetes exec works, spin up a VM or container, pass the credentials, and see if that carries over.

If my guess is correct, then a way around it would be creating a ~/.aws/credentials file using a null resource and templating out configuration that aws eks get-token can then reference.
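A hypothetical sketch of that idea (untested; it uses the hashicorp/local provider rather than a null resource, and var.access_key_id, var.secret_access_key, and var.token are placeholder variable names):

resource "local_file" "aws_credentials" {
  filename        = pathexpand("~/.aws/credentials")
  file_permission = "0600"
  # Render the same credentials that were fed to the aws provider so that
  # `aws eks get-token` can pick them up via the default profile.
  content = <<-EOT
    [default]
    aws_access_key_id     = ${var.access_key_id}
    aws_secret_access_key = ${var.secret_access_key}
    aws_session_token     = ${var.token}
  EOT
}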

The thought process I'm having is that the data being passed into the kubernetes provider contains no information about the AWS configuration, so I would expect it to fail if the instance running Terraform didn't have the aws cli configured.


A further thought: if the remote execution tool being used doesn't have a ~/.aws/config but is running inside an instance with an IAM role attached to it, then it would default to that IAM role, so it could still work as long as that IAM role has the ability to assume the role.

mebays avatar Apr 29 '22 19:04 mebays

@bryantbiggs I think the thought process I had above just reinforces your comment. I don't think there is anything that can be done in this module to fix this. I do have a suggestion, though: don't completely remove the aws_auth_configmap_yaml output unless you have other solutions coming up. The reasoning is that I could see a use case where terraform is run to provision a private cluster and may or may not be running on an instance that can reach that endpoint. If it isn't, the aws_auth_configmap_yaml output can be used in a completely separate process that can hit the private cluster endpoint. It all depends on how separation of duties comes into play (one person to provision, and maybe another to configure). It's just a thought.
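As a rough sketch of that separate process (hypothetical; it assumes the hashicorp/local provider and that the deprecated aws_auth_configmap_yaml output is still available):

resource "local_file" "aws_auth" {
  filename = "${path.module}/aws-auth.yaml"
  content  = module.eks.aws_auth_configmap_yaml
}

# Later, from a host that can actually reach the private cluster endpoint:
#   kubectl apply -f aws-auth.yaml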

mebays avatar Apr 29 '22 20:04 mebays

I would love to know what isn't working here. I spent a large chunk of this week trying every combo I could think of to get this to work, without success: different creds for the kube provider, different parallelism settings, recreating the code outside of the module so it would run after the EKS cluster module had finished, etc. I would always get either an authentication error, that the config map didn't exist, or that it couldn't be created. Very frustrating.

If we were to keep the now-deprecated output, I could at least revert my internal PR and keep using that old and terrible null exec code to patch the config map.

FernandoMiguel avatar Apr 29 '22 20:04 FernandoMiguel

The problem might be terraform-provider-kubernetes and not terraform-aws-eks, e.g. https://github.com/hashicorp/terraform-provider-kubernetes/issues/1479 and more reports about localhost connection refused. This one can really be difficult to catch.

tanvp112 avatar Apr 30 '22 08:04 tanvp112

@tanvp112 you are onto something there

We have this provider config in the image above; notice the highlighted bit that is not available until the cluster is up. So it is possible that this provider is getting initialised with the wrong endpoint, maybe even "localhost", and of course that explains why auth fails. It also explains why the second apply works fine, because by then the endpoint is correct.

FernandoMiguel avatar May 03 '22 11:05 FernandoMiguel

So my issue was with authentication, and I believe this example clearly illustrates the issue.

The example states that you must set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. After doing a little more digging, those having issues with authentication could try something like this:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    # This sets up the aws cli credentials when there is no config or credentials file on the host that runs the aws cli command
    env = {
      AWS_ACCESS_KEY_ID     = var.access_key_id
      AWS_SECRET_ACCESS_KEY = var.secret_access_key
      AWS_SESSION_TOKEN     = var.token
    }
    # This requires the awscli to be installed locally where Terraform is executed
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

I haven't gotten to try this myself, but it should work. The AWS_SESSION_TOKEN would only be needed for an assumed-role process.

mebays avatar May 03 '22 19:05 mebays

So my issue was with authentication, and I believe this example clearly illustrates the issue.

The example states that you must set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY. After doing a little more digging, those having issues with authentication could try something like this:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    # This sets up the aws cli credentials when there is no config or credentials file on the host that runs the aws cli command
    env = {
      AWS_ACCESS_KEY_ID     = var.access_key_id
      AWS_SECRET_ACCESS_KEY = var.secret_access_key
      AWS_SESSION_TOKEN     = var.token
    }
    # This requires the awscli to be installed locally where Terraform is executed
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

I haven't gotten to try this myself, but it should work. The AWS_SESSION_TOKEN would only be needed for an assumed-role process.

I honestly don't know what you are trying to do... AWS IAM auth can be done in many ways. Not everyone has a dedicated IAM account... we use assume roles, for example.

FernandoMiguel avatar May 04 '22 09:05 FernandoMiguel

I honestly don't know what you are trying to do... AWS IAM auth can be done in many ways. Not everyone has a dedicated IAM account... we use assume roles, for example.

When you assume a role you retrieve a temporary access key, secret key, and token. My code snippet is an example for when a user is running things in a jobbed-off process inside a container, where the container contains no context for AWS (no config or credentials file). That is my use case: my runs are on an isolated instance that does not persist (Terraform Cloud follows this same structure, but does not have the aws CLI installed by default) and run in a CI/CD pipeline fashion, not on a local machine.

When the aws provider is used, the configuration information is passed into the provider, as in this example. (I'm keeping it simple; my context actually uses dynamic credentials via HashiCorp Vault, but I don't want to introduce that complexity in this explanation.)

provider "aws" {
  region = "us-east-1"
  access_key = "<access key | passed via variable or some data query>"
  secret_key = "<secret access key | passed via variable or some data query>"
  token = "<session token | passed via variable or some data query>"
}

In this instance the AWS provider has all the information passed in using the provider configuration method. On this run no local aws config file or environment variables exist, so it needs this to make any AWS connection.

All AWS resources are created successfully in this process, except the aws-auth configmap, when using the suggested example:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    # This requires the awscli to be installed locally where Terraform is executed
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

The reason this is failing is that the Kubernetes provider has no context for what the aws command should use, because no config or environment variables are present. Therefore this will fail.

  • NOTE: This will also fail if you have a local AWS config loaded (via a config file or environment variables) that does not use the same role the EKS cluster was created with. By default the only identity with access is the user or role that created the cluster, so if the local user cannot assume the role used with the above aws provider, the kubernetes commands will fail as well.

That is how the suggested route came to be.

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    # This sets up the aws cli credentials when there is no config or credentials file on the host that runs the aws cli command
    env = {
      AWS_ACCESS_KEY_ID     = "<same access key passed to aws provider | passed via variable or some data query>"
      AWS_SECRET_ACCESS_KEY = "<same secret access key passed to aws provider | passed via variable or some data query>"
      AWS_SESSION_TOKEN     = "<same session token passed to aws provider | passed via variable or some data query>"
    }
    # This requires the awscli to be installed locally where Terraform is executed
    command     = "aws"
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id]
  }
}

In this provider block, the credentials/configuration needed for the aws cli to successfully call aws eks get-token --cluster-name <cluster name> are passed in explicitly, because the kubernetes provider does not care what was passed to the aws provider. There is no shared context, since no local configuration file or environment variables are being leveraged.

@FernandoMiguel does it make sense now what I was trying to attain? This may not be your use case, but it is useful information for anyone trying to run this module using some external remote execution tool.

I'll add that this module does not contain the issue, but adding the above snippet to the documentation may help out those who are purposely providing configuration to the aws provider instead of utilizing environment variables or local config files.

mebays avatar May 04 '22 12:05 mebays

In this provider block, the credentials/configuration needed for the aws cli to successfully call aws eks get-token --cluster-name <cluster name> are passed in explicitly, because the kubernetes provider does not care what was passed to the aws provider. There is no shared context, since no local configuration file or environment variables are being leveraged.

@FernandoMiguel does it make sense now what I was trying to attain? This may not be your use case, but it is useful information for anyone trying to run this module using some external remote execution tool.

It does. I've been fighting issues using the kube provider for weeks with what seems like a race condition or a failure to initialise the endpoint/creds. Sadly, in our case, your snippet does not help since creds are already available via the metadata endpoint. But it's a good idea to always double-check whether CLI tools are using the expected creds.

FernandoMiguel avatar May 04 '22 12:05 FernandoMiguel

I was having the same issue but the solution that worked for me is to configure the kubernetes provider to use the role, something like this:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id, "--role", "arn:aws:iam::${AWS_ACCOUNT_ID}:role/${ROLE_NAME}" ]
  }
}

alfredo-gil avatar May 05 '22 06:05 alfredo-gil

I was having the same issue but the solution that worked for me is to configure the kubernetes provider to use the role, something like this:

provider "kubernetes" {
  host                   = module.eks.cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks.cluster_certificate_authority_data)

  exec {
    api_version = "client.authentication.k8s.io/v1alpha1"
    command     = "aws"
    # This requires the awscli to be installed locally where Terraform is executed
    args = ["eks", "get-token", "--cluster-name", module.eks.cluster_id, "--role", "arn:aws:iam::${AWS_ACCOUNT_ID}:role/${ROLE_NAME}" ]
  }
}

Ohh that's an interesting option... Need to try that

FernandoMiguel avatar May 05 '22 07:05 FernandoMiguel

I have the same issue, but like this: Post "http://localhost/api/v1/namespaces/kube-system/configmaps": dial tcp [::1]:80: connect: connection refused when I set "manage_aws_auth_configmap = true" when deploying an EKS managed node group. Is there a solution for this?

Epic55 avatar May 05 '22 09:05 Epic55