gke-autoneg-controller

Ensure that autoneg handles multiple clusters

Open soellman opened this issue 3 years ago • 4 comments

Add tests that prove this case.

soellman, Aug 02 '21 14:08

Investigate rotating nodepools.

soellman, Aug 02 '21 14:08

I can say that it does "work" for multiple clusters as that is our use case.

I can't say that there are no issues related to multi-cluster usage, but we haven't run into any problems in about a year of use.

This includes regularly destroying and recreating clusters one at a time (for updates and configuration changes) while observing no production impact.

rwkarg, Aug 04 '21 23:08

That's great to hear that it's been working and stable!

I filed this issue to make sure that we have sufficient test coverage and documentation about the behavior, especially with the new multi-port PR just having been merged.

soellman, Aug 05 '21 07:08

I'm having issues when I deploy the autoneg controller using the terraform module into 2 GKE clusters in the same GCP project.

module "autoneg_transformation" {
  providers = {
    kubernetes = kubernetes.transformation
  }

  source           = "github.com/GoogleCloudPlatform/gke-autoneg-controller//terraform/autoneg"
  controller_image = "ghcr.io/googlecloudplatform/gke-autoneg-controller/gke-autoneg-controller:v0.9.6"
  project_id       = var.gcp_project
}

module "autoneg" {
  providers = {
    kubernetes = kubernetes.gke
  }

  source           = "github.com/GoogleCloudPlatform/gke-autoneg-controller//terraform/autoneg"
  controller_image = "ghcr.io/googlecloudplatform/gke-autoneg-controller/gke-autoneg-controller:v0.9.6"
  project_id       = var.gcp_project
}

Both module instances use the same service_account_id and the same custom role name, autonegRegional, so the deployment fails with the two errors below. I have replaced the actual GCP project ID with a dummy one.

│ Error: Error creating service account: googleapi: Error 409: Service account autoneg already exists within project projects/data-sandbox.
│ Details:
│ [
│   {
│     "@type": "type.googleapis.com/google.rpc.ResourceInfo",
│     "resourceName": "projects/data-sandbox/serviceAccounts/autoneg@data-sandbox.iam.gserviceaccount.com"
│   }
│ ]
│ , alreadyExists
│
│   with module.autoneg_transformation.module.gcp.google_service_account.autoneg[0],
│   on .terraform/modules/autoneg_transformation/terraform/gcp/main.tf line 25, in resource "google_service_account" "autoneg":
│   25: resource "google_service_account" "autoneg" {
│
╵
╷
│ Error: Custom project role projects/data-sandbox/roles/autonegRegional already exists and must be imported
│
│   with module.autoneg_transformation.module.gcp.google_project_iam_custom_role.autoneg,
│   on .terraform/modules/autoneg_transformation/terraform/gcp/main.tf line 57, in resource "google_project_iam_custom_role" "autoneg":
│   57: resource "google_project_iam_custom_role" "autoneg" {

The autoneg module needs to be configurable so that it can pass variables through to the gcp module, letting each instance use its own service account ID and custom role name. It would be great to address this. It would also be great to update the default controller_image to 0.9.6 instead of 0.9.2.
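
As a rough sketch of what is being asked for, a second module instance could look like the block below if the autoneg module forwarded per-instance names to the gcp module. The inputs `service_account_id` and `custom_role_id` are hypothetical here: they are assumed names for illustration, not variables the module is known to expose.

```hcl
# Hypothetical sketch only: service_account_id and custom_role_id are
# assumed input names, not confirmed variables of the autoneg module.
module "autoneg_transformation" {
  providers = {
    kubernetes = kubernetes.transformation
  }

  source           = "github.com/GoogleCloudPlatform/gke-autoneg-controller//terraform/autoneg"
  controller_image = "ghcr.io/googlecloudplatform/gke-autoneg-controller/gke-autoneg-controller:v0.9.6"
  project_id       = var.gcp_project

  # Per-instance names so the second deployment in the same project
  # does not collide with the existing "autoneg" service account
  # and "autonegRegional" custom role.
  service_account_id = "autoneg-transformation"
  custom_role_id     = "autonegRegionalTransformation"
}
```

With distinct names per instance, neither the 409 on the service account nor the "already exists and must be imported" error on the custom role should be triggered by the second module.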

sreenivas-ps, Jul 14 '22 16:07

I have just tested this and it seems to work fine.

rosmo, Mar 26 '24 10:03