terraform-provider-digitalocean
Terraform cannot find DigitalOcean cluster after cluster upgrade
Bug Report
Describe the bug
The DigitalOcean Kubernetes cluster resource is not found after an auto-upgrade.
I had a cluster on version "1.21.2-do.1" with auto-upgrade enabled, and I use its credentials in the kubernetes provider.
After the cluster was auto-upgraded to "1.21.2-do.2", the Kubernetes endpoint seems to have changed to localhost.
Given the Terraform configuration file below, when I run terraform plan it fails to get the namespace. Instead of using the Kubernetes cluster endpoint, the request is made to localhost.
│ Error: Get "http://localhost/api/v1/namespaces/apps": dial tcp [::1]:80: connect: connection refused
The error goes away after I change the version in the configuration to the upgraded version "1.21.2-do.2".
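For reference, the only change needed to clear the error was bumping the version argument on the cluster resource (a sketch; the rest of the configuration is shown in full below):

resource "digitalocean_kubernetes_cluster" "my_cluster" {
  # ... other arguments as in the configuration below ...
  version = "1.21.2-do.2" # previously "1.21.2-do.1"
}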
Affected Resource(s)
digitalocean_kubernetes_cluster
Expected Behavior
No error when getting namespaces.
Actual Behavior
kubernetes_namespace.apps: Refreshing state... [id=apps]
╷
│ Error: Get "http://localhost/api/v1/namespaces/apps": dial tcp [::1]:80: connect: connection refused
│
│ with kubernetes_namespace.apps,
│ on main.tf line 36, in resource "kubernetes_namespace" "apps":
│ 36: resource "kubernetes_namespace" "apps" {
Steps to Reproduce
- Create a DigitalOcean Kubernetes cluster on a specific version with auto-upgrade enabled using Terraform
- Let the auto-upgrade happen, or manually upgrade from DigitalOcean
- Run terraform plan
Terraform Configuration Files
resource "digitalocean_kubernetes_cluster" "my_cluster" {
name = "my-cluster"
region = "sgp1"
version = "1.21.2-do.1"
auto_upgrade = true
node_pool {
name = "my_cluster_pool"
size = "s-1vcpu-2gb"
node_count = 3
}
}
provider "kubernetes" {
host = digitalocean_kubernetes_cluster.my_cluster.endpoint
token = digitalocean_kubernetes_cluster.my_cluster.kube_config[0].token
cluster_ca_certificate = base64decode(digitalocean_kubernetes_cluster.my_cluster.kube_config[0].cluster_ca_certificate)
}
resource "kubernetes_namespace" "apps" {
metadata {
name = "apps"
}
depends_on = [
digitalocean_kubernetes_cluster.my_cluster
]
}
Terraform Version
Terraform v1.0.5 on darwin_amd64
- provider registry.terraform.io/digitalocean/digitalocean v2.8.0
- provider registry.terraform.io/hashicorp/helm v2.1.2
- provider registry.terraform.io/hashicorp/kubernetes v2.2.0
References
It seems linked to https://github.com/hashicorp/terraform-provider-kubernetes/blob/main/kubernetes/provider.go#L265-L277, where the Kubernetes provider falls back to connecting to localhost when given an incomplete or invalid configuration.
I think in this case, when we specify a cluster version and enable auto-upgrade, it is expected that the running version will drift from the version specified in the Terraform config after an auto-upgrade. Even so, the DigitalOcean provider should still find the same cluster.
Hi there,
Thanks for the write-up. We will look into this.
This is kind of critical for any serious use. I think it's related to the behaviour where the cluster gets recreated on any change to the Terraform resource: a version drift will cause the entire cluster to be brought down with an error.
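One possible stopgap (just a sketch on my side, not something the maintainers have confirmed) is to tell Terraform to ignore drift on the version attribute, so an auto-upgrade doesn't register as a change on the cluster resource:

resource "digitalocean_kubernetes_cluster" "my_cluster" {
  name         = "my-cluster"
  region       = "sgp1"
  version      = "1.21.2-do.1"
  auto_upgrade = true

  # Assumption: ignoring drift on "version" so an auto-upgrade does not
  # show up as a change that could force the cluster to be replaced.
  lifecycle {
    ignore_changes = [version]
  }

  node_pool {
    name       = "my_cluster_pool"
    size       = "s-1vcpu-2gb"
    node_count = 3
  }
}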
This is very likely related to what is discussed in https://github.com/digitalocean/terraform-provider-digitalocean/issues/562. The K8s provider block is evaluated before the updated credentials are received from the DOKS resource.
The Kubernetes provider docs have added a warning to discourage using interpolation (see https://github.com/hashicorp/terraform-provider-kubernetes/pull/1115).
When using interpolation to pass credentials to the Kubernetes provider from other resources, these resources SHOULD NOT be created in the same Terraform module where Kubernetes provider resources are also used. This will lead to intermittent and unpredictable errors which are hard to debug and diagnose. The root issue lies with the order in which Terraform itself evaluates the provider blocks vs. actual resources. Please refer to this section of Terraform docs for further explanation.
The most reliable way to configure the Kubernetes provider is to ensure that the cluster itself and the Kubernetes provider resources can be managed with separate apply operations.
We have a similar warning on the documentation for the DigitalOcean provider.
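As a rough sketch of that separation (assuming the cluster itself is managed in a different Terraform configuration and is looked up by name here via the digitalocean_kubernetes_cluster data source):

# Separate configuration for Kubernetes resources; the cluster itself is
# created and upgraded in another Terraform configuration.
data "digitalocean_kubernetes_cluster" "my_cluster" {
  name = "my-cluster"
}

provider "kubernetes" {
  host                   = data.digitalocean_kubernetes_cluster.my_cluster.endpoint
  token                  = data.digitalocean_kubernetes_cluster.my_cluster.kube_config[0].token
  cluster_ca_certificate = base64decode(data.digitalocean_kubernetes_cluster.my_cluster.kube_config[0].cluster_ca_certificate)
}

resource "kubernetes_namespace" "apps" {
  metadata {
    name = "apps"
  }
}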
Applying the changes for just the cluster before the pieces that use the K8s provider might help recover from this:
terraform apply -target=digitalocean_kubernetes_cluster.my_cluster
Hello @caalberts, I will be closing this issue. If you have any further questions please feel free to re-open this ticket.