
terraform apply throwing an error on the Google module.

Open datacabinet opened this issue 3 years ago • 21 comments

I am using the Google Terraform module version 3.44.0 with Terraform v1.0.5.

I am getting an error:

  ╷
  │ Error: Invalid for_each argument
  │
  │   on .terraform/modules/gke/cluster.tf line 196, in resource "google_container_node_pool" "pools":
  │  196:   for_each = local.node_pools
  │    ├────────────────
  │    │ local.node_pools is a map of map of string, known only after apply
  │
  │ The "for_each" value depends on resource attributes that cannot be
  │ determined until apply, so Terraform cannot predict how many instances
  │ will be created. To work around this, use the -target argument to first
  │ apply only the resources that the for_each depends on.
  ╵

Can someone point out what may be going wrong?

datacabinet avatar Aug 28 '21 23:08 datacabinet

I was also seeing this error when I tried to create multiple node-pools using the beta-private-cluster module.

After some investigation, it turned out to be because my node-pool definitions were referencing the output of another resource in the Terraform plan. Specifically, the node pools were referencing a service account email, as below.

  node_pools = [
    {
      name            = "main-pool"
      image_type      = "COS_CONTAINERD"
      machine_type    = var.client_cluster_machine_type
      service_account = google_service_account.cluster_default_sa.email
    },
    {
      name            = var.fl_node_pool_name
      image_type      = "COS_CONTAINERD"
      machine_type    = var.client_cluster_machine_type
      service_account = google_service_account.cluster_flpool_sa.email
    }
  ]

What is a bit strange is that referencing the SA email works fine for a single node pool, but not when there are two. I haven't had a chance to dig into it properly.

I worked around it by constructing the SA email in the node-pool definition, rather than referencing the output:

  service_account = format("%s@%s.iam.gserviceaccount.com", local.cluster_sa_name, var.project_id)
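
In context, that looks roughly like this (a sketch; local.cluster_sa_name is assumed to hold the account_id the service account was created with, so the email can be built entirely from values known at plan time):

  node_pools = [
    {
      name            = "main-pool"
      image_type      = "COS_CONTAINERD"
      machine_type    = var.client_cluster_machine_type
      # Built from plan-time values instead of referencing
      # google_service_account.cluster_default_sa.email
      service_account = format("%s@%s.iam.gserviceaccount.com", local.cluster_sa_name, var.project_id)
    }
  ]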

jtangney avatar Sep 17 '21 10:09 jtangney

multiple-pools-error.tf.txt

The attached Terraform config reproduces the error. If you comment out one of the node pools, it works.

jtangney avatar Sep 17 '21 11:09 jtangney

@jtangney Where is the var.fl_node_pool_name variable coming from? Can you try hard-coding it?

I'm not sure what is causing this error, since we don't use the service account value in the name of the node pool.

Note: if you just want to create a default SA for the cluster, you can do that automatically with create_sa = true. You don't need to pass in your own SA.
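
For example, a minimal sketch (placeholder values throughout; double-check the exact variable name against the module docs for the version you are using):

  module "gke" {
    source  = "terraform-google-modules/kubernetes-engine/google"
    version = "~> 3.0"  # pin to the version you are actually using

    project_id        = var.project_id
    name              = "example-cluster"
    region            = var.region
    network           = var.network
    subnetwork        = var.subnetwork
    ip_range_pods     = var.ip_range_pods
    ip_range_services = var.ip_range_services

    # Let the module create a dedicated service account for the nodes
    create_sa = true
  }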

morgante avatar Sep 17 '21 16:09 morgante

Hey Morgante. That variable is not the issue; I have tried hardcoding it and it doesn't make any difference. The issue is the dependence on the output of another resource (the service account email in this case).

I want to have different service accounts per node pool. So creating a new service account, or setting the cluster-level service account, is unfortunately not sufficient.

I have experimented with various combinations. My conclusion so far is that if you have more than one node pool, you cannot reference the output of another resource (the service account email in this instance). It works fine with exactly one node pool, but I wonder if this is because there is some special logic around node_pools[0] (as the module always assumes at least one pool).

You can reproduce it by running terraform plan with that config. It will fail initially; comment out one of the pools and it will then work. It will also work with two pools if you hardcode the service account values (rather than referencing them).

Hope that helps!

jtangney avatar Sep 17 '21 16:09 jtangney

What version of Terraform are you using? I wonder if this is somehow related to that.

morgante avatar Sep 17 '21 16:09 morgante

Terraform v1.0.6 on darwin_amd64

  • provider registry.terraform.io/hashicorp/external v2.1.0
  • provider registry.terraform.io/hashicorp/google v3.84.0
  • provider registry.terraform.io/hashicorp/google-beta v3.84.0
  • provider registry.terraform.io/hashicorp/kubernetes v2.5.0
  • provider registry.terraform.io/hashicorp/null v3.1.0
  • provider registry.terraform.io/hashicorp/random v3.1.0

jtangney avatar Sep 17 '21 16:09 jtangney

Thanks. Looking at our code, we're using for_each correctly (with the name as the key), so I'm not sure why it's raising an error.

Could you open an issue on Terraform core? https://github.com/hashicorp/terraform/issues

morgante avatar Sep 17 '21 16:09 morgante

Hmmm, I'm not so sure it's a Terraform core issue. Let me try to do more investigation.


jtangney avatar Sep 17 '21 17:09 jtangney

I'm fairly sure it's a Terraform core issue. We're using the name (fixed) as the key for the for_each. The fact that the value is indeterminate shouldn't stop Terraform from looping over the list.
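
Roughly, the pattern inside the module is this (a sketch, not the exact source):

  locals {
    # Keyed by the statically known pool name, so only the values
    # (e.g. a service account email) should be unknown at plan time
    node_pools = { for np in var.node_pools : np["name"] => np }
  }

  resource "google_container_node_pool" "pools" {
    for_each = local.node_pools
    name     = each.key
    cluster  = google_container_cluster.primary.name  # illustrative
    # ... remaining attributes come from each.value
  }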

morgante avatar Sep 17 '21 17:09 morgante

Ok, you're the expert! Will raise one.


jtangney avatar Sep 17 '21 17:09 jtangney

If it helps, we upgraded Terraform and had to rewrite the YAML, and things started working. Our codebase is smaller, though, and it still took a couple of days.

pankajkumar229 avatar Sep 17 '21 18:09 pankajkumar229

@jtangney - I'm having the same issue. Have you had a chance to raise a Jira against Terraform core? If so, what's the Jira #?

alessio4tech avatar Nov 16 '21 11:11 alessio4tech

Sorry, I never raised a Jira against Terraform core. I just worked around it as described in my earlier comment.

jtangney avatar Nov 16 '21 14:11 jtangney

Running into this issue as well.

dhodun avatar Jun 30 '22 13:06 dhodun

still an issue

mikejoseph-ah avatar Oct 19 '22 17:10 mikejoseph-ah

Still an issue...

skeenan947 avatar Oct 28 '22 19:10 skeenan947

It's still an issue :(

In my case, it happens with an external service account module.

My code for the node pools:

  node_pools = [
    {
      name               = "default-node-pool"
      machine_type       = "custom-8-16384"
      node_locations     = var.zone
      min_count          = 1
      max_count          = 5
      local_ssd_count    = 0
      disk_size_gb       = 45
      disk_type          = "pd-standard"
      image_type         = "COS"
      auto_repair        = true
      auto_upgrade       = true
      service_account    = module.service_account_gke[count.index].email
      spot               = true
      initial_node_count = 1
      count              = var.environment == "dev" ? 1 : 0
    },
    {
      name               = "default-node-pool"
      machine_type       = "custom-2-4096"
      min_count          = 1
      max_count          = 5
      local_ssd_count    = 0
      disk_size_gb       = 15
      disk_type          = "pd-standard"
      image_type         = "COS"
      auto_repair        = true
      auto_upgrade       = true
      service_account    = module.service_account_gke[count.index].email
      spot               = true
      initial_node_count = 1
      count              = var.environment == "dev" ? 0 : 1
    }
  ]

mdzierzecki avatar Oct 30 '22 12:10 mdzierzecki

This is an issue for us as well, since we need to create the boot_disk_kms_key first and then reference it in the node_pools map.

node_pools = [
    {
      name               = "worker"
      min_count          = 1
      max_count          = 2
      auto_upgrade       = true
      node_metadata      = "GKE_METADATA"
      boot_disk_kms_key  = google_kms_crypto_key.bootdisk.id
      enable_secure_boot = true
      machine_type       = "n2-standard-8"
    },
    {
      name               = "worker"
      min_count          = 0
      max_count          = 1
      auto_upgrade       = true
      node_metadata      = "GKE_METADATA"
      boot_disk_kms_key  = google_kms_crypto_key.bootdisk.id
      enable_secure_boot = true
      machine_type       = "n2-standard-16"
    }
  ]

With two node pools it fails, but with just one it succeeds. Is there any workaround besides hardcoding the boot disk key ID (constructing the ID from known names, as sketched below, is effectively still hardcoding)? Also, is there a link to a Terraform core issue for this?
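
For reference, that construct-the-ID approach would look something like this (all names are placeholders; crypto key IDs follow projects/PROJECT/locations/LOCATION/keyRings/RING/cryptoKeys/KEY):

  locals {
    # Hypothetical: assumes the key ring and key names are known at plan time
    bootdisk_kms_key_id = "projects/${var.project_id}/locations/${var.region}/keyRings/${var.key_ring_name}/cryptoKeys/bootdisk"
  }

Each pool would then set boot_disk_kms_key = local.bootdisk_kms_key_id instead of referencing google_kms_crypto_key.bootdisk.id.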

bsgrigorov avatar Apr 01 '23 04:04 bsgrigorov

Did someone file a bug? I ran into the same issue, and it's unclear what the best way to resolve it is.

diervo avatar Jan 30 '24 07:01 diervo

@bsgrigorov did you find a workaround for it?

pfuentealbat avatar Feb 21 '24 15:02 pfuentealbat

I think I had to run it with one node pool first and apply. Then I ran it with two or more, and on the second apply it worked.

bsgrigorov avatar Feb 21 '24 15:02 bsgrigorov