terraform-provider-kubernetes
kubernetes_manifest: Terraform often fails with "http2: server sent GOAWAY and closed the connection"
Terraform Version, Provider Version and Kubernetes Version
Terraform v1.3.2
on windows_amd64
+ provider registry.terraform.io/gavinbunney/kubectl v1.14.0
+ provider registry.terraform.io/hashicorp/helm v2.7.1
+ provider registry.terraform.io/hashicorp/kubernetes v2.11.0
+ provider registry.terraform.io/rancher/rancher2 v1.22.2
Affected Resource(s)
- kubernetes_manifest
Terraform Configuration Files
provider.tf:
terraform {
required_providers {
rancher2 = {
source = "rancher/rancher2"
version = "~>1.22.2"
}
kubernetes = {
source = "hashicorp/kubernetes"
version = "~>2.11.0"
}
kubectl = {
source = "gavinbunney/kubectl"
version = "~>1.14.0"
}
helm = {
source = "hashicorp/helm"
version = "~>2.7.1"
}
}
backend "azurerm" {
....
}
}
provider "rancher2" {
api_url = var.RANCHER_NOP_API_URL
access_key = var.RANCHER_NOP_TOKEN
secret_key = var.RANCHER_NOP_SECRET
}
provider "kubernetes" {
host = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
token = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
}
provider "kubectl" {
load_config_file = "false"
host = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
token = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
}
provider "helm" {
kubernetes {
host = "${var.RANCHER_NOP_API_URL}/k8s/clusters/${rancher2_cluster.cluster.id}"
token = "${var.RANCHER_NOP_TOKEN}:${var.RANCHER_NOP_SECRET}"
}
}
module/gatekeeper/gatekeeper.tf:
resource "kubernetes_manifest" "opa_config" {
manifest = {
apiVersion = "config.gatekeeper.sh/v1alpha1"
kind = "Config"
metadata = {
name = "config"
namespace = "cattle-gatekeeper-system"
labels = {
team = "skywalkers"
}
}
spec = {
match = [{
excludedNamespaces = ["kube-*", "cattle-*"]
processes = ["*"]
}]
}
}
}
Debug Output
Panic Output
N/A
Steps to Reproduce
terraform plan
Expected Behavior
Plan succeeds without error
Actual Behavior
Plan fails with an error like this:
│ with module.gatekeeper.kubernetes_manifest.opa_config,
│ on .terraform\modules\gatekeeper\gatekeeper\main.tf line 1934, in resource "kubernetes_manifest" "opa_config":
│ 1934: resource "kubernetes_manifest" "opa_config" {
│
│ The plugin returned an unexpected error from plugin.(*GRPCProvider).UpgradeResourceState: rpc
│ error: code = Unknown desc = failed to determine resource type ID: cannot get OpenAPI foundry:
│ failed get OpenAPI spec: http2: server sent GOAWAY and closed the connection; LastStreamID=199,
│ ErrCode=NO_ERROR, debug=""
Important Factoids
N/A
References
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
This smells like authentication issues, but it's also the first time I've heard of that type of reply from the API server (GOAWAY) 😄
Need to look into potential causes for that error message.
Yeah not very friendly, at least a "please" would be nice 😄
It's pretty random, and after it occurs, a subsequent `terraform plan` often succeeds
Any update on this?
I am facing this issue as well, but I keep getting the same error over and over again.
A temporary fix seems to be to destroy and recreate the certificate, or to run plan/apply with `-refresh=false`, but these solutions are just temporary hacks.
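For reference, the two temporary workarounds look like this as CLI commands. This is a hedged sketch: the resource address is taken from my configuration below, and whether `-refresh=false` avoids the error depends on the refresh step being what triggers the OpenAPI lookup.

```shell
# Workaround 1: skip the refresh step, which is what hits the apiserver
# OpenAPI endpoint and receives the GOAWAY:
terraform plan -refresh=false
terraform apply -refresh=false

# Workaround 2: destroy and recreate the affected resource
# (address is from my config; adjust to yours):
terraform destroy -target='module.services.kubernetes_manifest.selfsigned-star-certificate'
terraform apply
```

Both only paper over the problem; the error comes back on later runs.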
These are my versions:
Terraform v1.4.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/kubernetes v2.19.0
and resources
resource "kubernetes_manifest" "selfsigned-ca-issuer" {
manifest = {
apiVersion = "cert-manager.io/v1"
kind = "ClusterIssuer"
metadata = {
name = "selfsigned-ca-issuer"
}
spec = {
selfSigned = {}
}
}
}
resource "kubernetes_manifest" "selfsigned-star-certificate" {
manifest = {
apiVersion = "cert-manager.io/v1"
kind = "Certificate"
metadata = {
name = "selfsigned-star-certificate"
namespace = "default"
}
spec = {
commonName = "*.${var.base_hostname}"
dnsNames = [
"*.${var.base_hostname}"
]
secretName = "selfsigned-star-certificate"
privateKey = {
algorithm = "RSA"
size = 4096
}
issuerRef = {
name = kubernetes_manifest.selfsigned-ca-issuer.manifest.metadata.name
kind = "ClusterIssuer"
group = "cert-manager.io"
}
}
}
}
data "kubernetes_secret_v1" "star-certificate" {
metadata {
name = kubernetes_manifest.selfsigned-star-certificate.manifest.spec.secretName
namespace = kubernetes_manifest.selfsigned-star-certificate.manifest.metadata.namespace
}
}
After running `terraform plan` I keep getting:
module.services.kubernetes_manifest.selfsigned-ca-issuer: Refreshing state...
module.services.kubernetes_manifest.selfsigned-star-certificate: Refreshing state...
Planning failed. Terraform encountered an error while generating this plan.
╷
│ Error: Plugin error
│
│ with module.services.kubernetes_manifest.selfsigned-star-certificate,
│ on services/certificates.tf line 14, in resource "kubernetes_manifest" "selfsigned-star-certificate":
│ 14: resource "kubernetes_manifest" "selfsigned-star-certificate" {
│
│ The plugin returned an unexpected error from plugin.(*GRPCProvider).PlanResourceChange: rpc error: code = Unknown desc = failed to determine resource type ID: failed to look up GVK [cert-manager.io/v1, Kind=Certificate] among
│ available CRDs: unexpected error when reading response body. Please retry. Original error: http2: server sent GOAWAY and closed the connection; LastStreamID=199, ErrCode=NO_ERROR, debug=""
> This smells like authentication issues, but it's also the first time I've heard of that type of reply from the API server (GOAWAY) 😄
> Need to look into potential causes for that error message.
@alexsomesan After some investigation, it seems to be a feature of the api server that can be used when you have a load balancer and multiple control plane nodes.
As you can see here: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/
One of the parameters is `--goaway-chance`:
> To prevent HTTP/2 clients from getting stuck on a single apiserver, randomly close a connection (GOAWAY). The client's other in-flight requests won't be affected, and the client will reconnect, likely landing on a different apiserver after going through the load balancer again. This argument sets the fraction of requests that will be sent a GOAWAY. Clusters with single apiservers, or which don't use a load balancer, should NOT enable this. Min is 0 (off), Max is .02 (1/50 requests); .001 (1/1000) is a recommended starting point.
I only get this error on kubernetes_manifest resources though, so maybe it needs deeper investigation
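For anyone who wants to check whether their cluster has this enabled: the flag lives on the kube-apiserver itself. A hedged sketch, assuming a kubeadm-managed control plane (the manifest path and pod label may differ on other distributions):

```shell
# Check whether the running apiserver has GOAWAY injection enabled
# (component=kube-apiserver is the standard kubeadm label):
kubectl -n kube-system get pods -l component=kube-apiserver \
  -o jsonpath='{range .items[*]}{.spec.containers[0].command}{"\n"}{end}' \
  | grep goaway

# If it prints a --goaway-chance value greater than 0, edit the static pod
# manifest on each control-plane node and set it to 0 (or remove the flag;
# 0 is the default), e.g. in /etc/kubernetes/manifests/kube-apiserver.yaml:
#     - --goaway-chance=0
# The kubelet restarts the apiserver automatically when the manifest changes.
```

Note that disabling GOAWAY trades away the load-spreading behavior described above; the real fix would be for the provider's OpenAPI fetch to retry on this error, since GOAWAY with `ErrCode=NO_ERROR` is a normal, retryable condition.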
^ We're getting the same error but for other resources! Has there been a fix for this?
@aaj-synth I was able to fix this error by using multiple apiservers and putting a load balancer in front of the cluster, but setting `--goaway-chance 0` should also work.
I know it's not the fix you are looking for, but it works for now.