terraform-google-kubernetes-engine

Allow setting network tags on default node pool

Open tomasgareau opened this issue 3 years ago • 1 comment

I'm trying to bring up a safer-cluster in a Shared VPC network roughly based on the terraform-example-foundation/3-networks approach.

In this VPC network, access to private Google APIs is configured with:

  • DNS entries that point *.googleapis.com to private.googleapis.com (for resources without an external IP address, i.e., my safer-cluster nodes)
  • a firewall rule that allows egress to private.googleapis.com (i.e., 199.36.153.8/30) for any resources that have the network tag allow-google-apis
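For reference, the firewall rule described above might look roughly like the sketch below. The resource name, the two variables, and the TCP/443 restriction are assumptions made for illustration; only the destination range and the network tag come from the setup described here.

```hcl
# Hypothetical inputs; the names are illustrative only.
variable "network_project_id" {
  type        = string
  description = "Host project of the Shared VPC (assumed input)."
}

variable "network_self_link" {
  type        = string
  description = "Self link of the Shared VPC network (assumed input)."
}

# Egress rule that only applies to instances carrying the network tag.
resource "google_compute_firewall" "allow_private_google_apis" {
  name      = "allow-google-apis-egress" # hypothetical name
  project   = var.network_project_id
  network   = var.network_self_link
  direction = "EGRESS"

  # private.googleapis.com VIP range
  destination_ranges = ["199.36.153.8/30"]

  # Instances without this tag are not matched by the rule.
  target_tags = ["allow-google-apis"]

  allow {
    protocol = "tcp"
    ports    = ["443"] # assumption: HTTPS only
  }
}
```

Because the rule is scoped with target_tags, any node that lacks the tag falls through to whatever deny rule the network applies to other egress traffic.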

Like in the terraform-example-foundation-app, I've added the allow-google-apis network tag to the node_pools_tags variable.
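A minimal sketch of how that looks in the module call, assuming a single node pool with the hypothetical name "main-pool"; the other required inputs (project, network, subnetwork, secondary ranges, and so on) are left out for brevity, so this fragment is not complete on its own.

```hcl
module "gke" {
  source = "terraform-google-modules/kubernetes-engine/google//modules/safer-cluster"

  # Other required inputs omitted in this sketch.

  node_pools = [
    {
      name = "main-pool" # hypothetical pool name
    }
  ]

  # The "all" key applies the tags to every pool declared in node_pools.
  # It does not affect the default pool created together with the cluster.
  node_pools_tags = {
    all = ["allow-google-apis"]
  }
}
```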

What I'd expect to happen is:

  1. The google_container_cluster.primary resource gets created
  2. This creates a default node pool, which immediately gets deleted (as configured by the safer-cluster module)
  3. The node pool I configured now gets created (with the appropriate network tags)
  4. My cluster is ready to go

Instead, however, this seems to be happening:

  1. The google_container_cluster.primary resource gets created
  2. The default node pool gets created and tries to register itself with the control plane
  3. The default node pool cannot register itself with the control plane
  4. The create operation finally times out and the cluster errors out with "Error waiting for creating GKE cluster: All cluster resources were brought up, but: only 0 nodes out of 1 have registered"

In my firewall logs, I can see that the default pool is trying to reach private.googleapis.com (i.e., DNS is working as expected). However, since I cannot add the allow-google-apis network tag to this pool, this egress gets denied:

[image: firewall log showing the denied egress]

This may be related to https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/issues/305. Is there a way to add network tags to this pool so it can finish creating successfully (and get immediately deleted, haha)? Or is there an alternative recommended approach here?

tomasgareau avatar Jan 07 '22 22:01 tomasgareau

The fix here would be to add the network tags (from the first pool) to the cluster config, like we do for the service account.
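As a rough illustration of that idea (not the module's actual code), the tags could be set on the cluster's own node_config so the default pool comes up with them. The cluster name, location, and both variables below are hypothetical.

```hcl
# Hypothetical inputs; the names are illustrative only.
variable "default_pool_service_account" {
  type        = string
  description = "Service account email for the default pool (assumed input)."
}

variable "default_pool_tags" {
  type        = list(string)
  description = "Network tags for the default pool (assumed input)."
  default     = ["allow-google-apis"]
}

resource "google_container_cluster" "primary" {
  name     = "example-cluster" # hypothetical
  location = "us-central1"     # hypothetical

  # The default pool is still created first, then removed.
  remove_default_node_pool = true
  initial_node_count       = 1

  # node_config here applies to the default pool, so its nodes carry
  # the tags while they register with the control plane.
  node_config {
    service_account = var.default_pool_service_account
    tags            = var.default_pool_tags
  }
}
```

With the tag present on the default pool, the tag-scoped egress rule would match its nodes, which should let them reach private.googleapis.com and register before the pool is deleted.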

Happy to review a PR.

morgante avatar Jan 11 '22 06:01 morgante