
kubernetes_manifest: Terraform often fails with "http2: server sent GOAWAY and closed the connection"

Open · papanito opened this issue 3 years ago · 6 comments

Terraform Version, Provider Version and Kubernetes Version

Terraform v1.3.2
on windows_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.11.0
+ provider registry.terraform.io/rancher/rancher2 v1.22.2

Affected Resource(s)

  • kubernetes_manifest

Terraform Configuration Files

provider.tf:

terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2"
      version = "~>1.22.2"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~>2.11.0"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = "~>1.14.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "~>2.7.1"
    }
  }

  backend "azurerm" {
    ....
  }
}

provider "rancher2" {
  api_url    = var.RANCHER_NOP_API_URL
  access_key = var.RANCHER_NOP_TOKEN
  secret_key = var.RANCHER_NOP_SECRET
}

provider "kubernetes" {
  host  = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
  token = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
}

provider "kubectl" {
  load_config_file = "false"
  host             = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
  token            = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
}

provider "helm" {
  kubernetes {
    host  = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
    token = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
  }
}

module/gatekeeper/gatekeeper.tf:

resource "kubernetes_manifest" "opa_config" {
  manifest = {
    apiVersion = "config.gatekeeper.sh/v1alpha1"
    kind = "Config"
    metadata = {
      name = "config"
      namespace = "cattle-gatekeeper-system"
      labels = {
        team = "skywalkers"
      }
    }
    spec = {
      match = [{
        excludedNamespaces = ["kube-*", "cattle-*"]
        processes = ["*"]
      }]
    }
  }
}

Debug Output

Panic Output

N/A

Steps to Reproduce

  1. terraform plan

Expected Behavior

Plan succeeds without error

Actual Behavior

Plan fails with an error like this:

│   with module.gatekeeper.kubernetes_manifest.opa_config,
│   on .terraform\modules\gatekeeper\gatekeeper\main.tf line 1934, in resource "kubernetes_manifest" "opa_config":
│ 1934: resource "kubernetes_manifest" "opa_config" {
│
│ The plugin returned an unexpected error from plugin.(*GRPCProvider).UpgradeResourceState: rpc
│ error: code = Unknown desc = failed to determine resource type ID: cannot get OpenAPI foundry:
│ failed get OpenAPI spec: http2: server sent GOAWAY and closed the connection; LastStreamID=199,
│ ErrCode=NO_ERROR, debug=""

Important Factoids

N/A

References

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

papanito · Dec 09 '22 11:12

This smells like authentication issues, but it's also the first time I've heard of that type of reply from the API server (GOAWAY) 😄

Need to look into potential causes for that error message.

alexsomesan avatar Dec 14 '22 09:12 alexsomesan

Yeah, not very friendly; at least a "please" would be nice 😄 It's pretty random, and after it occurs, a subsequent terraform plan often succeeds.

papanito · Dec 14 '22 10:12

Any update on this? I am facing this issue as well, but I keep getting the same error over and over again. A temporary fix seems to be to destroy and recreate the certificate, or to run plan/apply with -refresh=false, but these are just temporary hacks.
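A minimal sketch of the -refresh=false workaround mentioned above (plain Terraform CLI assumed; adjust for any wrapper or CI tooling you use):

# Skip refreshing existing state so the provider does not hit the API server for every resource
terraform plan -refresh=false
terraform apply -refresh=false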

These are my versions:

Terraform v1.4.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/kubernetes v2.19.0

and resources

resource "kubernetes_manifest" "selfsigned-ca-issuer" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "ClusterIssuer"
    metadata   = {
      name = "selfsigned-ca-issuer"
    }
    spec = {
      selfSigned = {}
    }
  }
}

resource "kubernetes_manifest" "selfsigned-star-certificate" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "Certificate"
    metadata   = {
      name      = "selfsigned-star-certificate"
      namespace = "default"
    }
    spec = {
      commonName = "*.${var.base_hostname}"
      dnsNames   = [
        "*.${var.base_hostname}"
      ]
      secretName = "selfsigned-star-certificate"
      privateKey = {
        algorithm = "RSA"
        size      = 4096
      }
      issuerRef = {
        name  = kubernetes_manifest.selfsigned-ca-issuer.manifest.metadata.name
        kind  = "ClusterIssuer"
        group = "cert-manager.io"
      }
    }
  }
}

data "kubernetes_secret_v1" "star-certificate" {
  metadata {
    name      = kubernetes_manifest.selfsigned-star-certificate.manifest.spec.secretName
    namespace = kubernetes_manifest.selfsigned-star-certificate.manifest.metadata.namespace
  }
}

After running terraform plan I keep getting:

module.services.kubernetes_manifest.selfsigned-ca-issuer: Refreshing state...
module.services.kubernetes_manifest.selfsigned-star-certificate: Refreshing state...

Planning failed. Terraform encountered an error while generating this plan.

╷
│ Error: Plugin error
│ 
│   with module.services.kubernetes_manifest.selfsigned-star-certificate,
│   on services/certificates.tf line 14, in resource "kubernetes_manifest" "selfsigned-star-certificate":
│   14: resource "kubernetes_manifest" "selfsigned-star-certificate" {
│ 
│ The plugin returned an unexpected error from plugin.(*GRPCProvider).PlanResourceChange: rpc error: code = Unknown desc = failed to determine resource type ID: failed to look up GVK [cert-manager.io/v1, Kind=Certificate] among
│ available CRDs: unexpected error when reading response body. Please retry. Original error: http2: server sent GOAWAY and closed the connection; LastStreamID=199, ErrCode=NO_ERROR, debug=""

santimar · Apr 17 '23 15:04

> This smells like authentication issues, but it's also the first time I've heard of that type of reply from the API server (GOAWAY) 😄
>
> Need to look into potential causes for that error message.

@alexsomesan After some investigation, it seems to be a feature of the API server that can be enabled when you have a load balancer and multiple control plane nodes.

As you can see here: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

One of the parameters is --goaway-chance float

To prevent HTTP/2 clients from getting stuck on a single apiserver, randomly close a connection (GOAWAY). The client's other in-flight requests won't be affected, and the client will reconnect, likely landing on a different apiserver after going through the load balancer again. This argument sets the fraction of requests that will be sent a GOAWAY. Clusters with single apiservers, or which don't use a load balancer, should NOT enable this. Min is 0 (off), Max is .02 (1/50 requests); .001 (1/1000) is a recommended starting point.

I only get this error on kubernetes_manifest resources though, so maybe it needs deeper investigation.
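To see whether that flag is in play, a rough check (assuming a kubeadm-style control plane where the API server runs as static pods labelled component=kube-apiserver; the label and namespace are assumptions for other setups) would be something like:

# Look for the goaway-chance flag in the running apiserver pod specs
kubectl -n kube-system get pods -l component=kube-apiserver -o yaml | grep goaway-chance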

santimar · Apr 27 '23 16:04

^ We're getting the same error but for other resources! Has there been a fix for this?

aaj-synth · Nov 10 '23 21:11

@aaj-synth I was able to fix this error by using multiple API servers and putting a load balancer in front of the cluster, but setting --goaway-chance=0 should also work. I know it's not the fix you are looking for, but it works for now.
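For completeness, a sketch of how that flag could be disabled on a kubeadm-managed control plane (the manifest path and flag placement are assumptions; on managed clusters this flag is usually not configurable at all):

# On each control plane node, edit the kube-apiserver static pod manifest
sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml
# Under spec.containers[0].command, add or adjust the flag:
#   - --goaway-chance=0
# The kubelet detects the manifest change and restarts the apiserver automatically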

santimar · Nov 12 '23 11:11