terraform-google-kubernetes-engine
Node Pools aren't being set w/ automatic node repair, node updates, or Autoscaling
My input:
module "gke-cluster" {
source = "google-terraform-modules/kubernetes-engine/google"
version = "1.19.1"
general = {
name = "${var.cluster_name}"
env = "${var.environment}"
zone = "${var.gcp_zone}"
}
master = {
enable_kubernetes_alpha = true
username = "admin"
password = "${random_string.password.result}"
}
default_node_pool = {
node_count = 3
machine_type = "${var.node_machine_type}"
disk_size_gb = "${var.node_disk_size}"
disk_type = "pd-ssd"
oauth_scopes = "https://www.googleapis.com/auth/compute,https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/logging.write,https://www.googleapis.com/auth/monitoring,https://www.googleapis.com/auth/servicecontrol,https://www.googleapis.com/auth/service.management,https://www.googleapis.com/auth/devstorage.read_only,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/monitoring.write,https://www.googleapis.com/auth/pubsub,https://www.googleapis.com/auth/datastore"
# autoscaling { ## I've tried with both this, and this commented out.
# min_node_count = 1
# max_node_count = 10
# }
# management { ## DEFAULTS to TRUE so it should just work, but it's not on 10/30pm
# auto_repair = true
# auto_upgrade= true
# }
}
node_pool = []
}
I get:

With those blocks commented out, terraform.tfstate shows:
"google_container_cluster.new_container_cluster": {
"type": "google_container_cluster",
"depends_on": [
"data.google_container_engine_versions.region",
"local.name_prefix"
],
"primary": {
"id": "knative-dev-us-west1-c-master",
"attributes": {
.
.
.
"id": "knative-dev-us-west1-c-master",
.
.
.
"name": "knative-dev-us-west1-c-master",
.
.
.
"node_pool.#": "1",
"node_pool.0.autoscaling.#": "0",
"node_pool.0.initial_node_count": "3",
"node_pool.0.instance_group_urls.#": "1",
"node_pool.0.instance_group_urls.0": "https://www.googleapis.com/compute/v1/projects/lesv-008/zones/us-west1-c/instanceGroupManagers/gke-knative-dev-us-west1-default-pool-68956134-grp",
"node_pool.0.management.#": "1",
"node_pool.0.management.0.auto_repair": "false",
"node_pool.0.management.0.auto_upgrade": "false",
"node_pool.0.max_pods_per_node": "0",
"node_pool.0.name": "default-pool",
.
.
;
I would expect either to be able to set it, or (following the comments in the code) to get that behavior as the default. I'll look again in the morning in case this is operator error, as I'm very much a newbie with Terraform, GKE, and Knative (though I've built several clusters by hand).
I also tried just setting:
  min_node_count = 1
  max_node_count = 10
  auto_repair    = true
  auto_upgrade   = true
That failed inside default_node_pool, but worked inside a node_pool entry (a rough sketch of what I mean follows).
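For clarity, this is roughly the shape I mean; the pool name and the exact set of keys the module's node_pool list accepts are my guesses here, not taken from the module docs:

  node_pool = [
    {
      name           = "pool-1"                    # hypothetical pool name
      machine_type   = "${var.node_machine_type}"
      disk_size_gb   = "${var.node_disk_size}"
      min_node_count = 1
      max_node_count = 10
      auto_repair    = true
      auto_upgrade   = true
    },
  ]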
I also tried creating just a single node_pool and commenting out default_node_pool, but that gave me two node pools, and the default pool came up with some really bad defaults.
So, I tried again, and still no success.
module "gke-cluster" {
source = "google-terraform-modules/kubernetes-engine/google"
version = "1.19.1"
general = {
name = "${var.cluster_name}"
env = "${var.environment}"
zone = "${var.gcp_zone}"
}
master = {
# enable_kubernetes_alpha = true # disables autoRepair & autoUpdate
username = "admin"
password = "${random_string.password.result}"
disable_kubernetes_dashboard = false
monitoring_service = "monitoring.googleapis.com"
maintenance_window = "02:15"
}
default_node_pool = {
node_count = 3
machine_type = "${var.node_machine_type}"
disk_size_gb = "${var.node_disk_size}"
disk_type = "pd-ssd"
oauth_scopes = "${join(",", var.scopes )}"
min_node_count = 1
max_node_count = 10
auto_repair = true
auto_upgrade= true
}
}
Currently there is no way to enable autoscaling or auto-repair on the default node pool with the Google provider ...
Nothing in the docs: https://www.terraform.io/docs/providers/google/r/container_cluster.html#disk_size_gb
And nothing in the code: https://github.com/terraform-providers/terraform-provider-google/blob/51e63bfff2d2acba78bdbb35227669b820a4d61e/google/node_config.go
Personally, I often delete the default pool, but I think this should be filed as an issue against the provider.
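For anyone following along, the "delete the default pool" pattern usually looks roughly like this with plain provider resources; the names and variables are placeholders, and I haven't re-verified this exact snippet against the provider version used above:

resource "google_container_cluster" "cluster" {
  name = "${var.cluster_name}"
  zone = "${var.gcp_zone}"

  # Create the smallest possible default pool, then remove it so the
  # separately managed pool below is the only one left.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "primary" {
  name               = "primary-pool"   # placeholder name
  cluster            = "${google_container_cluster.cluster.name}"
  zone               = "${var.gcp_zone}"
  initial_node_count = 3

  autoscaling {
    min_node_count = 1
    max_node_count = 10
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    machine_type = "${var.node_machine_type}"
    disk_size_gb = "${var.node_disk_size}"
    oauth_scopes = "${var.scopes}"
  }
}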
I can do it with the gcloud command (and I get the right result) when I run:
gcloud container clusters create $CLUSTER_NAME \
--zone=$CLUSTER_ZONE \
--cluster-version=latest \
--machine-type=n1-standard-4 \
--enable-autoscaling --min-nodes=1 --max-nodes=10 \
--enable-autorepair \
--scopes=service-control,service-management,compute-rw,storage-ro,cloud-platform,logging-write,monitoring-write,pubsub,datastore \
--num-nodes=3
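To check what the node pool actually ended up with, something like the following should work; the output shape below is from memory and illustrative, not copied from a real run:

gcloud container node-pools describe default-pool \
  --cluster=$CLUSTER_NAME \
  --zone=$CLUSTER_ZONE \
  --format="yaml(autoscaling, management)"

# Expected, roughly:
# autoscaling:
#   enabled: true
#   maxNodeCount: 10
#   minNodeCount: 1
# management:
#   autoRepair: true
#   autoUpgrade: true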
Ah, I think I understand: we need to fix the Go code.
I ended up switching to the beta provider and using resources directly (and that worked for me):
resource "google_container_cluster" "gke_cluster" {
name = "${var.cluster_name}"
zone = "${var.gcp_zone}"
min_master_version = "${var.master_version}"
master_auth {
username = "admin"
password = "${random_string.password.result}"
}
addons_config {
kubernetes_dashboard {
disabled = false
}
}
logging_service = "logging.googleapis.com/kubernetes"
monitoring_service = "monitoring.googleapis.com/kubernetes"
maintenance_policy {
daily_maintenance_window {
start_time = "02:10"
}
}
lifecycle {
ignore_changes = ["node_pool"]
}
node_pool {
name = "default-pool"
node_count = "${var.min_node_count}"
autoscaling {
min_node_count = "${var.min_node_count}"
max_node_count = "${var.max_node_count}"
}
management {
auto_upgrade = true
auto_repair = true
}
node_config {
oauth_scopes = "${var.scopes}"
machine_type = "${var.node_machine_type}"
disk_size_gb = "${var.node_disk_size}"
disk_type = "pd-ssd"
}
}
}
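For context, the resource above assumes variable and resource declarations along these lines; the defaults shown are illustrative, not what was actually used:

# Illustrative declarations only; adjust names and defaults as needed.
variable "cluster_name" {}
variable "gcp_zone" {}
variable "master_version" {}

variable "node_machine_type" {
  default = "n1-standard-4"
}

variable "node_disk_size" {
  default = "100"
}

variable "min_node_count" {
  default = "1"
}

variable "max_node_count" {
  default = "10"
}

variable "scopes" {
  type    = "list"
  default = ["https://www.googleapis.com/auth/cloud-platform"]
}

# Generated cluster password, referenced above as random_string.password.result.
resource "random_string" "password" {
  length  = 16
  special = false
}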
Thank you @lesv, I will look at this with the beta provider to see whether I missed something on the stable version 👍
The standard provider seems to work fine for me with @lesv's solution.