terraform-google-kubernetes-engine
Private Cluster Config In modules/beta-private-cluster-update-variant Causes Cluster Recreation
TL;DR
The private_cluster_config.master_ipv4_cidr_block and private_cluster_config.private_endpoint_subnetwork values aren't recognised by Terraform after deploying a GKE cluster, so every subsequent plan proposes recreating the cluster.
Expected behavior
I'd expect Terraform to be aware of all the resources created with the cluster resource so that when I re-run terraform apply, it doesn't see any changes.
Observed behavior
  ~ private_cluster_config {
      - enable_private_endpoint     = false -> null
      + master_ipv4_cidr_block      = "10.0.0.0/28" # forces replacement
      + peering_name                = (known after apply)
      ~ private_endpoint            = "10.0.0.2" -> (known after apply)
      - private_endpoint_subnetwork = "projects/<project>/regions/europe-west1/subnetworks/gke-<cluster_name>-<random_id>-pe-subnet" -> null # forces replacement
      ~ public_endpoint             = "xx.xxx.xxx.xxx" -> (known after apply)
        # (1 unchanged attribute hidden)
        # (1 unchanged block hidden)
    }
Terraform Configuration
module "shared_env_project_gke_clusters" {
for_each = local.shp.gke_clusters
source = "terraform-google-modules/kubernetes-engine/google//modules/beta-private-cluster-update-variant"
version = "~> 30.0"
project_id = local.shp.project_id
name = each.key
region = each.value.region
zones = data.google_compute_zones.shared_project_gke[each.key].names
network = each.value.vpc.name
subnetwork = each.value.vpc.subnet
master_authorized_networks = each.value.master_authorized_networks
master_ipv4_cidr_block = each.value.vpc.master_ipv4_cidr_block
ip_range_pods = "pods"
ip_range_services = "services"
release_channel = each.value.auto_upgrade == true ? "REGULAR" : "UNSPECIFIED"
add_cluster_firewall_rules = true
disable_default_snat = true
enable_private_endpoint = false
enable_private_nodes = true
create_service_account = false
service_account = each.value.service_account != null ? each.value.service_account == "shared_project_name-gke" ? replace(each.value.service_account, "shared_project_name", "${local.shp.project_name}") : each.value.service_account : "${local.shp.project_number}[email protected]"
remove_default_node_pool = true
# Addons
gce_pd_csi_driver = true
config_connector = true
enable_cost_allocation = true
cluster_resource_labels = {
component = "gke"
}
node_pools = [for k, v in each.value.nodepools :
{
name = k
machine_type = v.machine_type
node_locations = join(",", data.google_compute_zones.shared_project_gke[each.key].names)
initial_node_count = v.initial_node_count
min_count = null
max_count = null
total_min_count = v.total_min_count
total_max_count = v.total_max_count
local_ssd_count = v.local_ssd_count
disk_size_gb = v.disk_size_gb
disk_type = v.disk_type
image_type = "COS_CONTAINERD"
auto_repair = true
auto_upgrade = true
preemptible = v.preemptible
}
]
node_pools_oauth_scopes = {
all = [
"https://www.googleapis.com/auth/cloud-platform",
]
}
node_pools_labels = merge({
for k, v in each.value.nodepools :
k => {
(k) = true
}
},
{
all = {}
})
node_pools_taints = {
for k, v in each.value.nodepools :
k => [{
key = k
value = true
effect = "PREFER_NO_SCHEDULE"
}]
}
}
Terraform Version
Terraform v1.6.6
Additional information
I suspect this is happening because GKE automatically creates the subnet projects/
The only way I've been able to get around this is to fork the repo and add private_cluster_config to the ignore_changes list in the lifecycle block for the cluster resource:
lifecycle {
  ignore_changes = [
    node_pool,
    initial_node_count,
    resource_labels["asmv"],
    private_cluster_config
  ]
}
I'm also using Terragrunt to manage the provider versions:
terraform {
  required_version = "~> 1.6.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.16.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = "~> 5.16.0"
    }
  }
}
It appears that the appropriate fix for Terraform is to disallow specifying master_ipv4_cidr_block and to require a pre-existing subnet in private_endpoint_subnetwork instead. Otherwise there doesn't seem to be any way to make the Terraform configuration behave appropriately given the new behavior of the backend provisioning.
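For illustration, here's a minimal sketch of what that direction could look like against the raw google_container_cluster resource, assuming the provider's private_endpoint_subnetwork argument; the subnet, network, and cluster names are made up, and whether/how the module itself exposes this varies by version:

resource "google_compute_subnetwork" "control_plane" {
  name          = "gke-example-pe-subnet" # made-up name
  region        = "europe-west1"
  network       = "my-vpc"                # made-up VPC
  ip_cidr_range = "10.0.0.0/28"           # the range previously passed as master_ipv4_cidr_block
}

resource "google_container_cluster" "example" {
  provider           = google-beta
  name               = "example"
  location           = "europe-west1"
  network            = "my-vpc"
  subnetwork         = "my-subnet" # made-up node subnet
  initial_node_count = 1

  # VPC-native networking is required for private clusters.
  ip_allocation_policy {}

  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    # Reference a pre-created subnet instead of master_ipv4_cidr_block, so the
    # value is fixed in configuration and can't drift after apply.
    private_endpoint_subnetwork = google_compute_subnetwork.control_plane.id
  }
}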
I have the same issue; instead of forking the module, I added a sed (macOS) after the init command until this is fixed.
sed -i '' 's|ignore_changes = \[node_pool, initial_node_count, resource_labels\["asmv"\]\]|ignore_changes = [node_pool, initial_node_count, resource_labels["asmv"], private_cluster_config]|' ./.terraform/modules/<YOUR_MODULE_NAME>/modules/beta-private-cluster-update-variant/cluster.tf
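For anyone wrapping the module with Terragrunt (as in the config above), the same patch could presumably be automated with an after_hook in terragrunt.hcl rather than running sed by hand. A rough sketch, with the hook and script names made up:

terraform {
  after_hook "patch_ignore_changes" {
    commands = ["init"]
    # patch-module.sh (hypothetical) would contain the sed one-liner shown above.
    execute  = ["./patch-module.sh"]
  }
}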
Hi @braveokafor, @agaudreault, @TitanRob16
I have experienced the same issue. Is it enough to use lifecycle/ignore_changes, though? When I describe the cluster, I see that it has a different privateClusterConfig.
First private cluster:
privateClusterConfig:
  enablePrivateNodes: true
  masterGlobalAccessConfig:
    enabled: true
  masterIpv4CidrBlock: 172.16.0.0/28
  peeringName: gke-<REDACTED>-dfcf-2ee3-peer
  privateEndpoint: 172.16.0.2
  publicEndpoint: <REDACTED>
Second cluster:
privateClusterConfig:
  enablePrivateNodes: true
  masterGlobalAccessConfig:
    enabled: true
  privateEndpoint: 172.16.0.34
  privateEndpointSubnetwork: projects/<REDACTED>/regions/europe-west1/subnetworks/gke-<REDACTED>-pe-subnet
  publicEndpoint: <REDACTED>
You can see that the second one doesn't have a peeringName or a masterIpv4CidrBlock.
Hi @Shaked,
I haven't experienced any issues thus far.
The masterIpv4CidrBlock from version 29.0 of the module now shows up as the ipCidrRange of the auto-created privateEndpointSubnetwork in version 30.0.
# Cluster 1
$ gcloud container clusters describe <REDACTED> --location europe-west2 | yq '.privateClusterConfig'
enablePrivateEndpoint: true
enablePrivateNodes: true
masterGlobalAccessConfig:
  enabled: true
masterIpv4CidrBlock: 10.1.0.0/28
peeringName: gke-<REDACTED>-peer
privateEndpoint: 10.1.0.2
publicEndpoint: <REDACTED>
# Cluster 2
$ gcloud container clusters describe <REDACTED> --location europe-west2 | yq -r '.privateClusterConfig'
enablePrivateNodes: true
masterGlobalAccessConfig:
  enabled: true
privateEndpoint: 10.1.0.2
privateEndpointSubnetwork: projects/<REDACTED>/regions/europe-west2/subnetworks/gke-<REDACTED>-pe-subnet
publicEndpoint: <REDACTED>
# Subnet
$ gcloud compute networks subnets describe gke-<REDACTED>-pe-subnet --region europe-west2 | yq '.ipCidrRange'
10.1.0.0/28
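For what it's worth, the same check could also be done from Terraform rather than gcloud by reading the auto-created subnet back with a data source. A sketch with placeholder names:

data "google_compute_subnetwork" "control_plane_pe" {
  name   = "gke-<REDACTED>-pe-subnet" # the auto-created control-plane subnet
  region = "europe-west2"
}

output "control_plane_cidr" {
  # Expected to print 10.1.0.0/28, matching the old masterIpv4CidrBlock.
  value = data.google_compute_subnetwork.control_plane_pe.ip_cidr_range
}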
Might be related to https://cloud.google.com/kubernetes-engine/docs/concepts/network-overview#control-plane
Control plane
In Kubernetes, the control plane manages the control plane processes, including the Kubernetes API server. How you access the control plane depends on the version of your GKE Autopilot or Standard cluster.
Clusters with Private Service Connect
Private or public clusters that meet any of the following conditions use Private Service Connect to privately connect nodes and the control plane:
- New public clusters in version 1.23 on or after March 15, 2022.
- New private clusters in version 1.29 after January 28, 2024.
Existing public clusters that don't meet the preceding conditions are being migrated to Private Service Connect. Therefore, these clusters might already use Private Service Connect. To check if your cluster uses Private Service Connect, run the gcloud container clusters describe command. If your public cluster uses Private Service Connect, the privateClusterConfig resource has the following values:
- The peeringName field is empty or doesn't exist.
- The privateEndpoint field has a value assigned.
However, existing private clusters that don't meet the preceding conditions are not migrated yet.
You can create clusters that use Private Service Connect and change the cluster isolation.
Use authorized networks to restrict the access to your cluster's control plane by defining the origins that can reach the control plane.
Private Service Connect resources that are used for GKE clusters are hidden.
⚠️ Warning: Public clusters with Private Service Connect created before January 30, 2022 use a Private Service Connect endpoint and forwarding rule. Both resources are named gke-[cluster-name]-[cluster-hash:8]-[uuid:8]-pe and permit the control plane and nodes to privately connect. GKE creates these resources automatically with no cost. If you remove these resources, cluster network issues including downtime will occur.