
Unable to deploy a kubernetes manifest when relying on another resource

Open Quintasan opened this issue 2 years ago • 31 comments

Since https://github.com/hashicorp/terraform-provider-kubernetes-alpha was archived and we can no longer comment on https://github.com/hashicorp/terraform-provider-kubernetes-alpha/issues/123, I'm cross-posting it here so it doesn't get forgotten.

Quintasan avatar Aug 23 '21 09:08 Quintasan

Can you clarify whether you mean using a depends_on dependency, as opposed to just referencing another resource for values?

txomon avatar Sep 14 '21 13:09 txomon

@Quintasan, same here with the cert-manager helm chart and the kubernetes_manifest ClusterIssuer.

dirien avatar Sep 28 '21 22:09 dirien

Same thing with any CRD.

If you run your Terraform code from scratch (with no resources existing yet) and you want your ClusterIssuer to depend on the helm_release or null_resource that will install said CRD, you won't be able to plan or apply anything yet.

To me, the point of using the kubernetes_manifest resource is to handle CRDs. It would be nice to have a mechanism that lets Terraform check the validity of the kubernetes_manifest but not the existence of the CRD when it relies on the application of another resource (cert-manager) to install the CRDs. Even nicer would be to throw a warning in that case.

Right now the other possibility is to handle the deployment of a stack in two steps:

  1. install your CRDs
  2. install whatever runs on those CRDs

Not ideal, but it works.

Raclaw-jl avatar Sep 30 '21 12:09 Raclaw-jl

@Raclaw-jl, agreed! This should be fixed, but Terraform has always had problems with CRD support... 😄

dirien avatar Sep 30 '21 14:09 dirien

This is a terraform limitation, not specific to kubernetes.

The limitation comes from not having all the data required at the planning stage. Another example of this limitation would be planning new namespaces in a still-to-be-created k8s cluster.

Edit: The discussion previously shared didn't fully match the scope of this known Terraform limitation.

txomon avatar Oct 01 '21 10:10 txomon

Hope this helps someone.

A working hack is to use separate modules for the helm release and the resources based on the CRD:

./modules/cert-manager/main.tf

resource "helm_release" "cert-manager" {
  name             = "cert-manager"
  repository       = "https://charts.jetstack.io"
  chart            = "cert-manager"
  namespace        = "ingress"
  create_namespace = true
  version          = "1.5.3"
  set {
    name  = "installCRDs"
    value = true
  }

  timeout = 150
}

./modules/certificates/main.tf

resource "kubernetes_manifest" "issuer" {
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "ClusterIssuer"
   .... 
}    

main.tf

module "cert-manager" {
  source     = "./modules/cert-manager"
}

module "certificates" {
  depends_on = [module.cert-manager]
  source     = "./modules/certificates"
}

mo4islona avatar Oct 04 '21 14:10 mo4islona

Same for me. Cannot use depends_on for a Kubernetes Terraform resource. Awaiting this feature.

Abhishekqwerty avatar Oct 12 '21 12:10 Abhishekqwerty

Also facing a similar issue: installing external-secrets-operator and then trying to set up a secret store CRD as part of the cluster bootstrapping.

In case it helps anyone, I ended up using a different workaround. You can wrap the CRD as a helm_release without creating your own chart. The idea is to leverage an existing chart like itscontained/raw which lets you define arbitrary YAML as part of the chart values:

# Instead of this ...
resource "kubernetes_manifest" "external_secrets_cluster_store" {
  depends_on = [helm_release.external_secrets]
  manifest = { ... }
}

# ... you can try using this
resource "helm_release" "external_secrets_cluster_store" {
  depends_on = [helm_release.external_secrets]
  name       = "external-secrets-cluster-store"
  repository = "https://charts.itscontained.io"
  chart      = "raw"
  version    = "0.2.5"
  values = [
    <<-EOF
    resources:
      - apiVersion: external-secrets.io/v1alpha1
        kind: ClusterSecretStore
        metadata:
          name: cluster-store
        spec:
          ... further contents of the ClusterSecretStore omitted ...
    EOF
  ]
}

DaniJG avatar Nov 05 '21 16:11 DaniJG

This issue needs to be fixed, but for those interested there is a workaround (mentioned here): use terraform-provider-kubectl, which allows you to apply a YAML file without checking that the type and apiVersion exist during the plan stage.

Example from the mentioned issue:

resource "helm_release" "cert_manager" {
  name       = "cert-manager"
  namespace  = "cert-manager"

  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  version    = "v1.2.0"

  create_namespace = true

  values = [
    file("values/cert-manager.yaml")
  ]

  provisioner "local-exec" {
    command = "echo 'Waiting for cert-manager validating webhook to get its CA injected, so we can start to apply custom resources ...' && sleep 60"
  }
}

resource "kubectl_manifest" "cluster_issuer_letsencrypt_prod" {
  depends_on = [ helm_release.cert_manager ]
  yaml_body  = <<YAML
apiVersion: "cert-manager.io/v1"}
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    ...
YAML
}

DBarthe avatar Nov 12 '21 11:11 DBarthe

Hi all, I'm facing a similar issue when using the depends_on flag with the stated Kubernetes provider. My code is as follows:

provider "kubernetes" {
  host                   = var.kube_host
  client_certificate     = base64decode(var.kube_client_certificate)
  client_key             = base64decode(var.kube_client_key)
  cluster_ca_certificate = base64decode(var.kube_cluster_ca_cert)
}
resource "kubernetes_manifest" "letsencrypt_issuer_staging" {

  manifest = yamldecode(templatefile(
    "${path.module}/manifests/letsencrypt-issuer.tpl.yaml",
    {
      "name"                      = "letsencrypt-staging"
      "namespace"                 = kubernetes_namespace.cert.metadata[0].name
      "email"                     = var.cloudflareemail
      "server"                    = "https://acme-staging-v02.api.letsencrypt.org/directory"
      "api_token_secret_name"     = kubernetes_secret_v1.example.metadata[0].name #this will be gotten from theh azure vault
      "api_token_secret_data_key" = keys(kubernetes_secret_v1.example.data)[0]
    }
  ))

  depends_on = [helm_release.cert_manager]
}

resource "kubernetes_manifest" "letsencrypt_issuer_production" {

  manifest = yamldecode(templatefile(
    "${path.module}/manifests/letsencrypt-issuer.tpl.yaml",
    {
      "name"                      = "letsencrypt-prod"
      "namespace"                 = kubernetes_namespace.cert.metadata[0].name
      "email"                     = var.cloudflareemail
      "server"                    = "https://acme-v02.api.letsencrypt.org/directory"
      "api_token_secret_name"     = kubernetes_secret_v1.example.metadata[0].name #this will be gotten from theh azure vault
      "api_token_secret_data_key" = keys(kubernetes_secret_v1.example.data)[0]
    }
  ))

  depends_on = [helm_release.cert_manager]
}

This seems to result in

cannot create REST client: no client config

using Terraform 1.1.3.

While the docs do say that a kubeconfig needs to be present to use kubernetes_manifest, I want to understand why this is, as other resources that get deployed to the cluster, such as a storage class or namespace, do not require a kubeconfig; rather, the connection seems to be derived from the values

  host                   = var.kube_host
  client_certificate     = base64decode(var.kube_client_certificate)
  client_key             = base64decode(var.kube_client_key)
  cluster_ca_certificate = base64decode(var.kube_cluster_ca_cert)

As a result, I'm a little baffled by the dependency on the kubeconfig. Also, I don't think the provided config is compatible with the kubectl_manifest resource. Any help is greatly appreciated.

Doc links referenced: https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/manifest

dc232 avatar Jan 20 '22 09:01 dc232

Hi, I bumped into the same issue while injecting an SSH key into a kubernetes_manifest resource:

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "kubernetes_manifest" "jenkins_agent_sts" {
  depends_on = [
    tls_private_key.ssh_key
  ]
  manifest = yamldecode(templatefile(
    "${path.module}/manifests/jenkins_agent_sts.tpl.yaml",
    {
      "namespace"           = var.namespace
      "image"               = var.agent_image
      "tag"                 = var.agent_image_tag
      "ssh_public_key"      = tls_private_key.ssh_key.public_key_openssh
    }
  ))
}

The error I get:

Error: Failed to determine GroupVersionResource for manifest

  with kubernetes_manifest.jenkins_agent_sts,
  on terraform.tf line 146, in resource "kubernetes_manifest" "jenkins_agent_sts":
  146: resource "kubernetes_manifest" "jenkins_agent_sts" {

unmarshaling unknown values is not supported

emilianofs avatar May 05 '22 19:05 emilianofs

@emilianofs it looks like the error is coming from the template file being decoded; however, the kubectl manifest being applied needs to be in YAML format, so instead of

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "kubernetes_manifest" "jenkins_agent_sts" {
  depends_on = [
    tls_private_key.ssh_key
  ]
  manifest = yamldecode(templatefile(
    "${path.module}/manifests/jenkins_agent_sts.tpl.yaml",
    {
      "namespace"           = var.namespace
      "image"               = var.agent_image
      "tag"                 = var.agent_image_tag
      "ssh_public_key"      = tls_private_key.ssh_key.public_key_openssh
    }
  ))
}

try

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "kubernetes_manifest" "jenkins_agent_sts" {
  depends_on = [
    tls_private_key.ssh_key
  ]
  manifest = templatefile(
    "${path.module}/manifests/jenkins_agent_sts.tpl.yaml",
    {
      "namespace"           = var.namespace
      "image"               = var.agent_image
      "tag"                 = var.agent_image_tag
      "ssh_public_key"      = tls_private_key.ssh_key.public_key_openssh
    }
  )
}

The above should produce the desired result; if not, see below.

You could also use the templatefile function in Terraform, see https://www.terraform.io/language/functions/templatefile for more details on integrating it with the existing resource definition posed above.

An alternative can be found below, where the data is rendered in memory:

resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}
data "template_file" "jenkins_agent"{
  template = file("${path.module}/manifests/jenkins_agent_sts.tpl.yaml")
  vars = {
      "namespace"                      = "${var.namespace}"
      "image"                 = var.agent_image
      "tag"                     =  var.agent_image_tag
      "ssh_public_key"     = tls_private_key.ssh_key.public_key_openssh
    }
}
resource "kubernetes_manifest" "jenkins_agent_sts" {
  depends_on = [
    tls_private_key.ssh_key
  ]
  manifest = data.template_file.jenkins_agent.rendered
    }
  ))
}

dc232 avatar May 05 '22 20:05 dc232

@dc232 Thanks for your reply. I have tried the two options and both of them return the same error: unmarshaling unknown values is not supported.

Removing yamldecode()

Error: Failed to determine GroupVersionResource for manifest

  with kubernetes_manifest.jenkins_agent_sts,
  on terraform.tf line 163, in resource "kubernetes_manifest" "jenkins_agent_sts":
  163: resource "kubernetes_manifest" "jenkins_agent_sts" {

unmarshaling unknown values is not supported

Using template_file resource:

Error: Failed to determine GroupVersionResource for manifest

  with kubernetes_manifest.jenkins_agent_sts,
  on terraform.tf line 156, in resource "kubernetes_manifest" "jenkins_agent_sts":
  156: resource "kubernetes_manifest" "jenkins_agent_sts" {

unmarshaling unknown values is not supported

The only way I can get it to work is by defining the manifest in HCL instead of using the YAML template. There is a really useful tool to convert a YAML manifest to HCL, k2tf: https://github.com/sl1pm4t/k2tf
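For illustration, a minimal sketch of that HCL form (the StatefulSet fields here are hypothetical; the point is that only leaf values are unknown at plan time, while the manifest's structure, apiVersion, and kind stay known):

resource "kubernetes_manifest" "jenkins_agent_sts" {
  manifest = {
    apiVersion = "apps/v1"
    kind       = "StatefulSet"
    metadata = {
      name      = "jenkins-agent" # hypothetical name
      namespace = var.namespace
    }
    spec = {
      # ... spec fields written inline; referencing
      # tls_private_key.ssh_key.public_key_openssh in a leaf value
      # here is fine, because only that leaf is unknown at plan
      # time, not the manifest's structure.
    }
  }
}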

emilianofs avatar May 05 '22 22:05 emilianofs

Is this issue on the roadmap?

patsevanton avatar Aug 17 '22 14:08 patsevanton

This issue still exists

winston0410 avatar Aug 25 '22 20:08 winston0410

Can any of the recent reporters please provide an example that causes this issue?

alexsomesan avatar Aug 25 '22 20:08 alexsomesan

Can any of the recent reporters please provide an example that causes this issue?

Unfortunately, I have no example in my saved snippets, but I think I still remember why it happens.

If I remember correctly, to reproduce this you need two resources in your Terraform project:

  1. the CRD definition (could be a kubernetes_manifest or part of a helm release)
  2. a kubernetes_manifest creating an object of the CRD type defined in the previous point

If I understand correctly, the problem is that when Terraform refreshes state it tries to query Kubernetes to check whether the item of the CRD type (2nd point) exists. But k8s returns an error, because the CRD itself wasn't created yet (so the API for this CRD does not exist yet).

P.S. Sorry for not providing an actual reproduction snippet, but I have no cluster to reproduce this right now.
P.P.S. I just checked the initial issue (it has a reproduction snippet) https://github.com/hashicorp/terraform-provider-kubernetes-alpha/issues/123 and it seems that maybe I misunderstood this issue, or maybe my case is just one of the consequent problems.
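A minimal reproduction along those lines might look like this (a sketch with a hypothetical CRD file and group/kind, untested):

resource "kubernetes_manifest" "crd" {
  # Installs the CustomResourceDefinition itself
  manifest = yamldecode(file("${path.module}/example-crd.yaml"))
}

resource "kubernetes_manifest" "cr" {
  # This is the part that fails at plan time: the provider asks the
  # cluster for the example.com/v1 schema before the CRD above exists.
  depends_on = [kubernetes_manifest.crd]
  manifest = {
    apiVersion = "example.com/v1"
    kind       = "Example"
    metadata = {
      name      = "demo"
      namespace = "default"
    }
  }
}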

dm3ch avatar Aug 25 '22 21:08 dm3ch

@alexsomesan

This module would not work if kubernetes_manifest were used instead of kubectl_manifest:

terraform {
  required_providers {
    kubernetes = {
      source  = "registry.terraform.io/hashicorp/kubernetes"
      version = "2.12.1"
    }
    helm = {
      source  = "hashicorp/helm"
      version = "2.6.0"
    }
    kubectl = {
      source  = "gavinbunney/kubectl"
      version = "1.14.0"
    }
  }
}

variable "namespace" {
  type = string
  description = "k8s namespace used in this module"
}

variable "email" {
  type        = string
  description = "Email address that Let's Encrypt will use to send notifications about expiring certificates and account-related issues to."
  sensitive   = true
}

variable "api_token" {
  type        = string
  description = "API Token for Cloudflare"
  sensitive   = true
}

resource "helm_release" "cert_manager" {
  name       = "cert-manager"
  namespace = var.namespace
  repository = "https://charts.jetstack.io"
  chart      = "cert-manager"
  version = "1.9.1"
  
  set {
    name  = "installCRDs"
    value = "true"
  }
}

# Make the API Token a secret available globally
resource "kubernetes_secret_v1" "letsencrypt_cloudflare_api_token_secret" {
  metadata {
    name      = "letsencrypt-cloudflare-api-token-secret"
    namespace = var.namespace
  }

  data = {
    "api-token" = var.api_token
  }
}

resource "kubectl_manifest" "letsencrypt_issuer_staging" {
  yaml_body = templatefile(
    "${path.module}/letsencrypt-issuer.tpl.yaml",
    {
      "name"                      = "letsencrypt-staging"
      "email"                     = var.email
      "server"                    = "https://acme-staging-v02.api.letsencrypt.org/directory"
      "api_token_secret_name"     = kubernetes_secret_v1.letsencrypt_cloudflare_api_token_secret.metadata.0.name
      "api_token_secret_data_key" = keys(kubernetes_secret_v1.letsencrypt_cloudflare_api_token_secret.data).0
    }
  )

  depends_on = [
    # Need to install the CRDs first
    helm_release.cert_manager
  ]
}

resource "kubectl_manifest" "letsencrypt_issuer_production" {
  yaml_body = templatefile(
    "${path.module}/letsencrypt-issuer.tpl.yaml",
    {
      "name"                      = "letsencrypt-production"
      "email"                     = var.email
      "server"                    = "https://acme-v02.api.letsencrypt.org/directory"
      "api_token_secret_name"     = kubernetes_secret_v1.letsencrypt_cloudflare_api_token_secret.metadata.0.name
      "api_token_secret_data_key" = keys(kubernetes_secret_v1.letsencrypt_cloudflare_api_token_secret.data).0
    }
  )

  depends_on = [
    # Need to install the CRDs first
    helm_release.cert_manager
  ]
}

winston0410 avatar Aug 26 '22 09:08 winston0410

@winston0410 thanks a lot for sharing the example! I'll have a go at running it on my side, but one obvious thing already pops up. If the "cert_manager" helm_release resource installs CRDs, then you cannot have CRs based on those CRDs be managed as manifests in the same apply operation. This is a known limitation of the provider, and it has to do with needing access to those CRDs' schemas at planning time (when they would in fact not be present if applied at the same time as the CR manifests). The workaround is to split the operation into two applies, where the first one installs the CRDs and anything else other than CRs, and the second apply deploys the CRs. One single-configuration variant of that split is sketched below.
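A hedged sketch of gating the CRs behind a variable (the manage_crs flag is hypothetical, not an official recommendation; apply once with the default false so the CRDs land first, then again with -var=manage_crs=true):

variable "manage_crs" {
  type    = bool
  default = false
}

resource "kubernetes_manifest" "cluster_issuer" {
  # With count = 0 on the first apply, no instances are planned and
  # no schema lookup happens, so the plan succeeds even though the
  # CRD doesn't exist yet.
  count      = var.manage_crs ? 1 : 0
  depends_on = [helm_release.cert_manager]
  manifest = {
    apiVersion = "cert-manager.io/v1"
    kind       = "ClusterIssuer"
    metadata = {
      name = "letsencrypt-staging"
    }
    spec = {
      # ... issuer spec ...
    }
  }
}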

alexsomesan avatar Sep 07 '22 14:09 alexsomesan

Are there any plans to resolve this issue?

I am running into the same issue as @alexsomesan but, unfortunately, don't have the option to run multiple operations. One of the main reasons for going with terraform for our k8s setup was having a single tool for cloud and cluster setup.

Blunderchips avatar Feb 11 '23 06:02 Blunderchips

@Blunderchips not being able to do multi-stage applies is a problem that you will find in many cases, such as when you provision a cluster through Google Cloud and then want to install something on it. Terraform just can't compute the final state, and that's the main reason for multi-stage applies. I am running a setup like the one you mention and it works wonders.

The reason the kubectl provider works is that it doesn't run any server-side checks to make sure the plan is correct. A possible suggestion would be to optionally disable validation; however, this is something that I doubt will be prioritized, because the main limitation is not the provider but rather the fact that a multi-stage apply is needed.

txomon avatar Feb 13 '23 17:02 txomon

(Quoting DaniJG's itscontained/raw workaround from above.)

I tried to migrate to kubernetes_manifest after kubectl_manifest started behaving flaky and producing inconsistent results provisioning a ClusterIssuer for cert-manager. This is the only workaround I could find that doesn't require a separate run context. The itscontained chart is no longer available; I replaced it with https://artifacthub.io/packages/helm/wikimedia/raw

robertobado avatar May 11 '23 17:05 robertobado

Regarding the suggestions in https://github.com/hashicorp/terraform-provider-kubernetes/issues/1380#issuecomment-962058148 and https://github.com/hashicorp/terraform-provider-kubernetes/issues/1380#issuecomment-967022975, @DaniJG posted a nice self-contained explanation on Medium: "Avoid the Terraform kubernetes_manifest resource".

mloskot avatar Nov 27 '23 20:11 mloskot

Unfortunately the kubectl_manifest resource seems to be broken for Kubernetes 1.27+ (https://github.com/gavinbunney/terraform-provider-kubectl/issues/270), leaving itscontained/raw as the only good solution right now.

sharebear avatar Nov 27 '23 20:11 sharebear

On Thu, 30 Nov 2023 at 12:52, Mina Farrokhnia wrote:

@robertobado You mentioned that you replaced it with https://artifacthub.io/packages/helm/itscontained/raw as the itscontained chart is not working; I am wondering which repository you used?

I've also tested this and it did not work:

repository = "https://charts.itscontained.io"
chart      = "raw"
version    = "0.2.5"

I tried to find a new working repo from the artifacthub link https://artifacthub.io/packages/helm/itscontained/raw?modal=install, but clicking on INSTALL shows me the same repo that I tried earlier. Here is the error that I am facing:

helm repo add itscontained https://charts.itscontained.io
Error: looks like "https://charts.itscontained.io" is not a valid chart repository or cannot be reached: Get "https://charts.itscontained.io/index.yaml": tls: failed to verify certificate: x509: certificate has expired or is not yet valid: current time 2023-11-30T12:04:34+01:00 is after 2023-09-06T01:35:52Z

This currently works for me:

chart      = "raw"
repository = "https://helm-charts.wikimedia.org/stable/"
version    = "0.3.0"

mloskot avatar Nov 30 '23 12:11 mloskot

we're using this one from dysnix: https://artifacthub.io/packages/helm/dysnix/raw
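For anyone switching, a hedged sketch of what using the dysnix raw chart might look like (the repository URL and values schema are assumptions based on it being a fork of itscontained/raw; check Artifact Hub for the current chart version):

resource "helm_release" "cluster_issuer" {
  name       = "cluster-issuer"
  repository = "https://dysnix.github.io/charts" # assumed repo URL
  chart      = "raw"

  values = [
    <<-EOF
    resources:
      - apiVersion: cert-manager.io/v1
        kind: ClusterIssuer
        metadata:
          name: letsencrypt-staging
        spec:
          # ... issuer spec ...
    EOF
  ]
}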

moss2k13 avatar Nov 30 '23 12:11 moss2k13

Unfortunately the kubectl_manifest resource seems to be broken for Kubernetes 1.27+ gavinbunney/terraform-provider-kubectl#270 leaving itscontained/raw as the only good solution right now.

kubectl_manifest indeed doesn't look maintained. And having tried the helm_release approach, there are multiple caveats to that solution:

  - When some apply fails, for whatever reason, the object gets tainted. The next plan/apply re-creates objects: delete everything, then create everything.
  - Plan doesn't show anything unless you enable the experimental manifest feature at the provider level (see the sketch below), which mostly shows you helm template output. A good start, but it doesn't seem to validate objects against the API and isn't aware of mutations.
  - In general, even for a couple of files, changing a single input, plan is so slooow, and apply is worse: a single chart with 3 objects takes 5 minutes to apply. With kubernetes_manifest + TF templating, less than a second.
  - Having applied a helm_release with Terraform, try this: edit the objects on the cluster you manage, outside of Terraform state, then run a tf plan. "No changes"; hooray...

At which point, that helm provider is pretty much the worst thing I've ever used for managing Kubernetes. And my company did write their own ungodly ansible-playbooks-wrapped-in-go Terraform provider...
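For reference, enabling that experimental manifest diff looks like this (a sketch; the kubeconfig-based auth shown here is an assumption):

provider "helm" {
  kubernetes {
    config_path = "~/.kube/config" # assumed kubeconfig auth
  }

  # Renders the chart's manifests into state so plans show object-level diffs
  experiments {
    manifest = true
  }
}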

How come we don't have a single viable, feature-complete Terraform provider for managing Kubernetes?!

faust64 avatar Jan 10 '24 14:01 faust64

My issue is not related to having a CRD dependency, but rather to a simple variable interpolation within the yamldecode()-decoded text in the manifest argument of kubernetes_manifest. It seems like the resource code for kubernetes_manifest needs safeguards to properly handle Terraform plans that rely on other resources that haven't been created yet.

tonybaltazar avatar Apr 05 '24 00:04 tonybaltazar

As mentioned in https://github.com/hashicorp/terraform-provider-kubernetes/issues/1380#issuecomment-1119095354, it seems like removing yamldecode() completely and using plain HCL instead of YAML for the Kubernetes manifest works perfectly fine in our case. This is a good workaround for the issue most people are having here.

tonybaltazar avatar Apr 16 '24 22:04 tonybaltazar