terraform-google-kubernetes-engine
Allow setting network tags on default node pool
I'm trying to bring up a safer-cluster in a Shared VPC network roughly based on the terraform-example-foundation/3-networks approach.
In this VPC network, access to private Google APIs is configured with:
- DNS entries that point `*.googleapis.com` to `private.googleapis.com` (for resources without an external IP address, i.e., my `safer-cluster` nodes)
- a firewall rule that allows egress to `private.googleapis.com` (i.e., 199.36.153.8/30) for any resources that have the network tag `allow-google-apis`
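For context, a minimal sketch of what that network setup looks like in Terraform, loosely following the 3-networks approach (the resource names and the `var.shared_vpc_self_link` variable are illustrative, not the exact foundation code):

```hcl
# Private DNS zone that resolves *.googleapis.com to private.googleapis.com
# inside the Shared VPC.
resource "google_dns_managed_zone" "private_googleapis" {
  name       = "private-googleapis"
  dns_name   = "googleapis.com."
  visibility = "private"

  private_visibility_config {
    networks {
      network_url = var.shared_vpc_self_link
    }
  }
}

resource "google_dns_record_set" "googleapis_cname" {
  managed_zone = google_dns_managed_zone.private_googleapis.name
  name         = "*.googleapis.com."
  type         = "CNAME"
  ttl          = 300
  rrdatas      = ["private.googleapis.com."]
}

resource "google_dns_record_set" "private_googleapis_a" {
  managed_zone = google_dns_managed_zone.private_googleapis.name
  name         = "private.googleapis.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"]
}

# Egress to the private.googleapis.com VIP range is only allowed for
# instances carrying the allow-google-apis network tag.
resource "google_compute_firewall" "allow_google_apis_egress" {
  name      = "allow-google-apis-egress"
  network   = var.shared_vpc_self_link
  direction = "EGRESS"

  allow {
    protocol = "tcp"
    ports    = ["443"]
  }

  destination_ranges = ["199.36.153.8/30"]
  target_tags        = ["allow-google-apis"]
}
```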
Like in the terraform-example-foundation-app, I've added the `allow-google-apis` network tag to the `node_pools_tags` variable.
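Concretely, my module invocation looks roughly like this (trimmed to the tag wiring; other required inputs elided):

```hcl
module "gke" {
  source = "terraform-google-modules/kubernetes-engine/google//modules/safer-cluster"

  project_id = var.project_id
  name       = "my-safer-cluster"
  # ... network, subnetwork, IP ranges, etc. elided ...

  node_pools = [
    {
      name = "default-node-pool"
    },
  ]

  # Tag every node pool so the egress firewall rule above matches its nodes.
  node_pools_tags = {
    all = ["allow-google-apis"]
  }
}
```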
What I'd expect to happen is:
- The `google_container_cluster.primary` resource gets created
- This creates a default node pool, which immediately gets deleted (as configured by the `safer-cluster` module)
- The node pool I configured now gets created (with the appropriate network tags)
- My cluster is ready to go
Instead, however, this seems to be happening:
- The `google_container_cluster.primary` resource gets created
- The default node pool gets created and tries to register itself with the control plane
- The default node pool cannot register itself with the control plane
- The `create` operation finally times out and the cluster errors out with "Error waiting for creating GKE cluster: All cluster resources were brought up, but: only 0 nodes out of 1 have registered"
In my firewall logs, I can see that the default pool is trying to reach `private.googleapis.com` (i.e., DNS is working as expected). However, since I cannot add the `allow-google-apis` network tag to this pool, this egress gets denied:

This may be related to https://github.com/terraform-google-modules/terraform-google-kubernetes-engine/issues/305. Is there a way to add network tags to this pool so it can finish creating successfully (and get immediately deleted, haha)? Or is there an alternative recommended approach here?
The fix here would be to add the network tags (from the first pool) in the cluster config, like we already do for the service account.
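A rough sketch of what that change could look like inside the module's `google_container_cluster` resource (the exact variable plumbing may differ from the module's internals):

```hcl
resource "google_container_cluster" "primary" {
  # ... existing cluster configuration ...

  node_config {
    # The module already forwards the service account to the default pool.
    service_account = local.service_account

    # Proposed: also forward the first pool's network tags, so the
    # short-lived default pool can reach private.googleapis.com while
    # registering with the control plane.
    tags = concat(
      lookup(var.node_pools_tags, "all", []),
      lookup(var.node_pools_tags, var.node_pools[0].name, []),
    )
  }
}
```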
Happy to review a PR.