terraform-provider-helm
terraform apply crashed with "Kubernetes cluster unreachable" error.
Terraform, Provider, Kubernetes and Helm Versions
Terraform version: 1.0.9
Provider version: 2.4.1
Kubernetes version: 1.21
Affected Resource(s)
- helm_release
Terraform Configuration Files
provider "helm" {
  alias = "helm_hamc"
  kubernetes {
    host                   = data.aws_eks_cluster.eks_cluster.endpoint
    token                  = data.aws_eks_cluster_auth.eks_cluster_auth.token
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster.certificate_authority[0].data)
  }
}

resource "helm_release" "test" {
  provider         = helm.helm_hamc
  name             = "test"
  create_namespace = false
  namespace        = kubernetes_namespace.test.metadata[0].name
  chart            = "${path.module}/helm/chart/ftest-${var.chart_version}.tgz"

  values = [
    templatefile("${path.module}/values.yaml", {
      environment = var.customer
      region      = var.region
    })
  ]
}
Steps to Reproduce
- terraform apply
Expected Behavior
terraform apply to complete successfully
Actual Behavior
When I run terraform apply, some AWS resources are installed first, before the helm_release starts (there are dependencies between them). After the apply has been running for about 30 minutes, the helm release starts and fails with the following error:
Kubernetes cluster unreachable: the server has asked for the client to provide credentials
With no changes, on the second attempt the terraform apply completes with no error.
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Hi @avnerv,
If I understand you correctly, you spin up a new Kubernetes cluster and apply a Helm chart to it within the same Terraform code. If so, the observed behaviour is expected. In short, the root cause is how Terraform initializes providers: it does so all in one shot at the very beginning. In this case, the first run of the code does not have a valid Kubernetes configuration for the Helm provider, so it fails. Once you run it a second time, the cluster is up and running, Terraform can fetch the configuration for the provider, and the apply succeeds.
In this case, you either need to separate cluster management from resource provisioning, or use the Terraform -target option to first spin up the cluster and then apply the resources.
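For illustration, a two-phase apply could look roughly like this (the module address is only a placeholder for your cluster module):

# First create only the cluster (placeholder target address):
terraform apply -target=module.eks_cluster
# Then apply everything else, including the helm_release:
terraform apply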
This behaviour is mentioned in the Kubernetes provider's documentation here.
I hope that helps.
Thank you.
Thanks for the answer, but this is not the case.
I created the EKS cluster in another tf state that is not related to the deployment of the helm chart.
For example - I have two tf states, one for deploying all the infra resources (VPC, EKS, SG, IAM, etc.) and one for deploying my apps (helm chart).
So back to the original issue: when I run terraform apply, it creates 80% of the related resources (namespaces, MongoDB, Redis, etc.), and then, after the apply has been running for about 30 minutes, I receive the error message when deploying the helm chart...
Thank you for the clarification. Then it looks like the token is expiring because it is short-lived. In this case, you can try the exec plugin. Please use api_version: client.authentication.k8s.io/v1beta1.
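A minimal sketch of that, reusing the data sources from your configuration above (var.cluster_name is only an example), could look like:

provider "helm" {
  alias = "helm_hamc"
  kubernetes {
    host                   = data.aws_eks_cluster.eks_cluster.endpoint
    cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks_cluster.certificate_authority[0].data)
    # The token is fetched on demand by the exec plugin instead of being read
    # once from a data source, so it does not go stale during a long apply.
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      command     = "aws"
      args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
    }
  }
}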
I hope that helps.
Thank you.
I got the same error, but I am already using the exec plugin with aws eks get-token.
My helm_release failed midway, leaving the release as failed and making it difficult to retry a terraform apply, since I am now getting the error: Error: cannot re-use a name that is still in use.
I feel like there could be an issue with the provider.
Does the exec plugin refresh the token for every request to the cluster endpoint?
I am facing a similar issue. It was working fine 6-7 days ago, but since yesterday it has suddenly stopped working and throws this error saying the cluster is unreachable.
I also tried completely removing the helm release, but it still fails. The cluster is available and accessible, but the provider is not able to connect.
Does this have something to do with the Kubernetes provider getting a release 2 days ago? link
I am facing a similar issue with helm provider version 2.9.0 using the aws exec / cluster_ca_certificate and token approach, but it works with the kubeconfig approach, which is not a best practice.
Note: the same token and exec configuration work fine with other providers, such as kubernetes with kubectl manifest resources.
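For reference, the kind of kubernetes provider block I am comparing against looks roughly like this (the variable names are only illustrative):

provider "kubernetes" {
  host                   = var.cluster_endpoint
  cluster_ca_certificate = base64decode(var.cluster_ca_cert)
  # Same exec-based authentication that fails for me through the helm provider.
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", var.cluster_name]
  }
}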
Hey all! I just passed through this issue (with a slightly different log, actually) and learned about a problem that might be new and that some of you might be experiencing while using EKS and the exec plugin. I'll post it here since this is an already open issue on the subject, in case anyone needs it.
Versions
- Terraform v1.5.6
- AWS Provider v3.76.1
- Helm Provider v2.11.0
I was upgrading my AWS lab, which was previously working (source code available here), and got the following error:
2023-09-10T12:35:38.677-0300 [ERROR] vertex "helm_release.argocd" error: Kubernetes cluster unreachable: Get "https://3EED65CB939BE8F433B62C22D9F7E2B0.gr7.us-east-1.eks.amazonaws.com/version": getting credentials: decoding stdout: couldn't get version/kind; json parse error: json: cannot unmarshal string into Go value of type struct { APIVersion string "json:\"apiVersion,omitempty\""; Kind string "json:\"kind,omitempty\"" }
Then I compared the command I was using in the provider with the configuration exported by the aws eks update-kubeconfig command:
users:
- name: arn:aws:eks:us-east-1:499237116720:cluster/gitops-eks
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      args:
      - --region
      - us-east-1
      - eks
      - get-token
      - --cluster-name
      - gitops-eks
      - --output
      - json
      command: aws
      env: null
      interactiveMode: IfAvailable
      provideClusterInfo: false
I immediately noticed that the awscli was adding the --output json arg. I ran the aws eks get-token command with and without the output arg; here is the output without it:
client.authentication.k8s.io/v1beta1 ExecCredential
STATUS 2023-09-10T17:19:05Z <token>
And with it:
{
    "kind": "ExecCredential",
    "apiVersion": "client.authentication.k8s.io/v1beta1",
    "spec": {},
    "status": {
        "expirationTimestamp": "2023-09-10T17:19:11Z",
        "token": "<token>"
    }
}
The problem was solved for me by simply adding the output arg to my exec plugin configuration. It was a bit confusing at first, since I didn't find any issue about it and I was using the code suggested in the documentation here. It seems that AWS changed the default output of the get-token command at some point, but I wasn't able to find when.
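If you want to check which default output format your CLI is currently configured with, something like this should show it (it prints nothing if no explicit default is set):

aws configure get output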
Maybe it would be interesting to add this arg to the documentation, so people don't face this problem while following it:
provider "helm" {
  kubernetes {
    host                   = var.cluster_endpoint
    cluster_ca_certificate = base64decode(var.cluster_ca_cert)
    exec {
      api_version = "client.authentication.k8s.io/v1beta1"
      args        = ["eks", "get-token", "--cluster-name", var.cluster_name, "--output", "json"]
      command     = "aws"
    }
  }
}
aws configure, set output to json
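For example, something like this (it changes the default output format for the default profile):

aws configure set output json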
Does someone have a viable solution for this problem? I am still getting this error with the helm provider.