I can use helm cli to deploy the crane to a GKE cluster but not via terraform

Open csharpknife2002 opened this issue 2 years ago • 2 comments

Describe the bug

In the same GKE cluster, I can install crane successfully with the helm CLI using the following command:

helm install gocrane -n crane-test --create-namespace  crane/crane --set craned.containerArgs.prometheus-address=http://chen-prometheus-server.prometheus.svc.cluster.local:8080  --debug

But it always fails when I use Terraform scripts. (I have no problem deploying Prometheus, Grafana, and other apps via Terraform to the same cluster.)


data "terraform_remote_state" "gke" {
  backend = "gcs" 
  config = {
    bucket  = "xxx-terraform-states"
    prefix  = "cluster"
  }
}

provider "google" {
  project = data.terraform_remote_state.gke.outputs.project_id
  region  = data.terraform_remote_state.gke.outputs.region
}

data "google_client_config" "default" {}

data "google_container_cluster" "my_cluster" {
  name     = data.terraform_remote_state.gke.outputs.kubernetes_cluster_name
  location = data.terraform_remote_state.gke.outputs.zone
  project = data.terraform_remote_state.gke.outputs.project_id
}

provider "helm" {
  kubernetes {
    host = data.terraform_remote_state.gke.outputs.kubernetes_cluster_host
    token = data.google_client_config.default.access_token
    cluster_ca_certificate = base64decode(data.google_container_cluster.my_cluster.master_auth[0].cluster_ca_certificate)
  }
  debug = true
}

provider "kubernetes" {
  host                   = "https://${data.terraform_remote_state.gke.outputs.kubernetes_cluster_host}"
  token                  = "${data.google_client_config.default.access_token}"
  cluster_ca_certificate = "${base64decode(data.google_container_cluster.my_cluster.master_auth.0.cluster_ca_certificate)}"
}

provider "kubectl" {
  load_config_file       = false
  host                   = "https://${data.terraform_remote_state.gke.outputs.kubernetes_cluster_host}"
  token                  = "${data.google_client_config.default.access_token}"
  cluster_ca_certificate = "${base64decode(data.google_container_cluster.my_cluster.master_auth.0.cluster_ca_certificate)}"
}

resource "kubernetes_namespace" "gocrane_ns" {
  metadata {
    name = "crane-system"
  }
}

resource "helm_release" "grafana-gocrane" {
  name  = "grafana-gocrane"
  repository = "https://grafana.github.io/helm-charts"
  chart = "grafana"

  timeout = 120
  cleanup_on_fail = true
  force_update    = false
  namespace       = kubernetes_namespace.gocrane_ns.metadata.0.name
  # version = "6.11.0"


  depends_on = [ kubernetes_namespace.gocrane_ns]

  values = [
    file("${path.module}/grafana_override_values.yaml")
  ]
}

resource "helm_release" "gocrane" {
  name  = "gocrane"
  repository = "https://gocrane.github.io/helm-charts"
  chart = "crane"

  timeout = 300
  cleanup_on_fail = true
  force_update    = false
  namespace       = kubernetes_namespace.gocrane_ns.metadata.0.name
  

  set {
    name = "craned.containerArgs.prometheus-address"
    value = "http://chen-prometheus-server.prometheus.svc.cluster.local:8080"
  }

  depends_on = [ helm_release.grafana-gocrane ]
}

resource "helm_release" "fadvisor" {
  name  = "fadvisor"
  repository = "https://gocrane.github.io/helm-charts"
  chart = "fadvisor"

  timeout = 120
  cleanup_on_fail = true
  force_update    = false
  namespace       = kubernetes_namespace.gocrane_ns.metadata.0.name

  set {
    name = "craned.containerArgs.prometheus-address"
    value = "http://chen-prometheus-server.prometheus.svc.cluster.local:8080"
  }

  depends_on = [helm_release.gocrane]
}



It always fails with the following logs, regardless of how long I set the timeout. (With the CLI, the installation takes a few seconds.)

Warning: Helm release "gocrane" was created but has a failed status. Use the helm command to investigate the error, correct it, then run Terraform again.

│ 
│   with helm_release.gocrane,
│   on main.tf line 101, in resource "helm_release" "gocrane":
│  101: resource "helm_release" "gocrane" {
│ 
╵
╷
│ Error: timed out waiting for the condition
│ 
│   with helm_release.gocrane,
│   on main.tf line 101, in resource "helm_release" "gocrane":
│  101: resource "helm_release" "gocrane" 

Reproduce steps

Expected behavior

Screenshots image

Environment (please complete the following information):

  • K8S Version: Server Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.8-gke.500", GitCommit:"f117e29cb87cfb7e1de32ab4e163fb01ac5d0af9", GitTreeState:"clean", BuildDate:"2023-03-23T10:22:38Z", GoVersion:"go1.19.7 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}
  • Crane Version: 0.10.0
  • Browser [e.g. chrome, safari]

csharpknife2002, May 25 '23 04:05

Is the chart download blocked by network?

qmhu, May 26 '23 02:05

@qmhu, no, the download succeeded. Actually, I didn't see any difference between using the helm CLI and Terraform. Everything was created as expected, except that Terraform fails after a while, and I have no clue what it is failing on (or what it is waiting for).

clin4 avatar Jun 14 '23 21:06 clin4
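
One difference between the two install paths may explain the timeout, though nothing in this thread confirms it. The Terraform `helm_release` resource defaults to `wait = true`, so it blocks until every resource in the release reports ready or `timeout` expires. `helm install` without `--wait` returns as soon as the manifests are submitted, which would account for the CLI finishing in a few seconds. The sketch below reuses the `gocrane` release from the configuration above and only adds `wait = false`; treating the readiness wait as the cause is an assumption, not a verified diagnosis.

```hcl
resource "helm_release" "gocrane" {
  name       = "gocrane"
  repository = "https://gocrane.github.io/helm-charts"
  chart      = "crane"
  namespace  = kubernetes_namespace.gocrane_ns.metadata.0.name

  timeout         = 300
  cleanup_on_fail = true
  force_update    = false

  # Assumption: skip the provider's readiness wait so the apply behaves like
  # `helm install` without --wait. This does not fix a workload that never
  # becomes ready; it only stops Terraform from blocking on it.
  wait = false

  set {
    name  = "craned.containerArgs.prometheus-address"
    value = "http://chen-prometheus-server.prometheus.svc.cluster.local:8080"
  }

  depends_on = [helm_release.grafana-gocrane]
}
```

If the apply succeeds with `wait = false`, some workload in the release is probably not reaching a ready state. Running the original CLI command with `--wait` added, or listing the pods in the `crane-system` namespace with `kubectl get pods -n crane-system`, should show which one. Note also that the CLI test installed into `crane-test` while the Terraform configuration targets `crane-system`, so the two runs are not strictly identical.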