terraform-provider-google
Terraform google_project_iam_binding deletes GCP compute engine default service account from IAM principals
Update
See Usability improvements for *_iam_policy and *_iam_binding resources #8354
The google_project_iam_binding resource is authoritative, which means it will delete any binding that is NOT explicitly specified in the Terraform configuration.
Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the project are preserved.
It is hard to get a clear idea from this description of what Terraform does with google_project_iam_binding, but as GCP has identified, google_project_iam_binding deleted every account that held the "roles/editor" role and was not listed in the members attribute.
Still, I believe this is a terraform defect.
As per the Google APIs Service Agent document, these are essential service accounts that GCP manages internally. Terraform should not delete any such GCP-managed internal service accounts, as doing so brings the GCP project down. I cannot think of a use case where we would need this to happen.
Please, instead of asserting "works as designed", do not delete the GCP-managed internal service accounts, as they are essential to make the GCP project work.
Original issue raised
Terraform google_project_iam_binding deletes GCP compute engine default service account from IAM principals has the detailed step-by-step reproduction steps and snapshots.
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request.
- Please do not leave +1 or me too comments, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
- If an issue is assigned to the modular-magician user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to hashibot, a community member has claimed the issue already.
Terraform Version
$ terraform version
Terraform v1.0.4
on linux_amd64
+ provider registry.terraform.io/hashicorp/google v4.6.0
Affected Resource(s)
GCP IAM Compute Engine default service account. Terraform removes it from the IAM principals, after which it can no longer manage Compute Engine, and hence GKE nodes as well.
- google_project_iam_binding
Terraform Configuration Files
After further investigation, "roles/editor" alone is sufficient to reproduce the issue.
variable "PROJECT_ID" {
  type        = string
  description = "GCP Project ID"
  default     = "test-tf-sa"
}

variable "REGION" {
  type        = string
  description = "GCP Region"
  default     = "us-central1"
}

variable "roles_to_grant_to_service_account" {
  description = "IAM roles to grant to the service account"
  type        = list(string)
  default = [
    "roles/editor", # <------------------------------ Including only roles/editor will reproduce the issue
    "roles/iam.serviceAccountAdmin",
    "roles/resourcemanager.projectIamAdmin"
  ]
}

provider "google" {
  project = var.PROJECT_ID
  region  = var.REGION
}

resource "google_service_account" "terraform" {
  account_id   = "terraform"
  display_name = "terraform service account"
}

resource "google_project_iam_binding" "terraform" {
  # One binding per role in the list
  for_each = toset(var.roles_to_grant_to_service_account)

  project = var.PROJECT_ID
  role    = each.value

  #--------------------------------------------------------------------------------
  # Grant the roles to the service account
  #--------------------------------------------------------------------------------
  members = [
    "serviceAccount:${google_service_account.terraform.email}"
  ]
}
$ terraform apply --auto-approve
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# google_project_iam_binding.terraform["roles/editor"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/editor"
}
# google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/iam.serviceAccountAdmin"
}
# google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"] will be created
+ resource "google_project_iam_binding" "terraform" {
+ etag = (known after apply)
+ id = (known after apply)
+ members = (known after apply)
+ project = "test-tf-sa"
+ role = "roles/resourcemanager.projectIamAdmin"
}
# google_service_account.terraform will be created
+ resource "google_service_account" "terraform" {
+ account_id = "terraform"
+ disabled = false
+ display_name = "terraform service account"
+ email = (known after apply)
+ id = (known after apply)
+ name = (known after apply)
+ project = (known after apply)
+ unique_id = (known after apply)
}
Plan: 4 to add, 0 to change, 0 to destroy.
google_service_account.terraform: Creating...
google_service_account.terraform: Creation complete after 2s [id=projects/test-tf-sa/serviceAccounts/[email protected]]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creating...
google_project_iam_binding.terraform["roles/editor"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creating...
google_project_iam_binding.terraform["roles/iam.serviceAccountAdmin"]: Creation complete after 9s [id=test-tf-sa/roles/iam.serviceAccountAdmin]
google_project_iam_binding.terraform["roles/editor"]: Creation complete after 9s [id=test-tf-sa/roles/editor]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Still creating... [10s elapsed]
google_project_iam_binding.terraform["roles/resourcemanager.projectIamAdmin"]: Creation complete after 10s [id=test-tf-sa/roles/resourcemanager.projectIamAdmin]
Apply complete! Resources: 4 added, 0 changed, 0 destroyed.
Debug Output
Panic Output
Expected Behavior
Terraform will not remove the GCP Compute Engine Default service account from the IAM principals.
Actual Behavior
Before running the script, the Compute Engine default account exists in the IAM principals (with Compute Engine API enabled).
After running the Terraform script, the GCP Compute Engine default service account has been removed from the IAM principals.
The gcloud projects get-iam-policy command does not show the Compute Engine default service account [email protected] either.
$ GCP_PROJECT_ID=test-tf-sa
$ gcloud projects get-iam-policy $GCP_PROJECT_ID
bindings:
- members:
- serviceAccount:[email protected]
role: roles/compute.admin
- members:
- serviceAccount:[email protected]
role: roles/compute.instanceAdmin
- members:
- serviceAccount:[email protected]
role: roles/compute.serviceAgent
- members:
- serviceAccount:service-1079157603081@container-engine-robot.iam.gserviceaccount.com
role: roles/container.serviceAgent
- members:
- serviceAccount:[email protected]
role: roles/containerregistry.ServiceAgent
- members:
- serviceAccount:[email protected]
role: roles/editor
- members:
- user:****@gmail.com
role: roles/owner
- members:
- serviceAccount:[email protected]
role: roles/pubsub.serviceAgent
etag: BwXVf2S5fCQ=
version: 1
Because of this, GKE clusters can be neither deleted nor created, because the Compute Engine permissions are gone.
$ gcloud container clusters delete cluster-1 --zone=us-central1-c
The following clusters will be deleted.
- [cluster-1] in [us-central1-c]
Do you want to continue (Y/n)? Y
Deleting cluster cluster-1...done.
ERROR: (gcloud.container.clusters.delete) Some requests did not succeed:
- args: ['Operation [<Operation\n clusterConditions: [<StatusCondition\n canonicalCode: CanonicalCodeValueValuesEnum(PERMISSION_DENIED, 7)\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">]\n detail: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n endTime: \'2022-01-14T00:20:54.190004708Z\'\n error: <Status\n code: 7\n details: []\n message: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.">\n name: \'operation-1642119632548-20038ec5\'\n nodepoolConditions: []\n operationType: OperationTypeValueValuesEnum(DELETE_CLUSTER, 2)\n selfLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/operations/operation-1642119632548-20038ec5\'\n startTime: \'2022-01-14T00:20:32.548792723Z\'\n status: StatusValueValuesEnum(DONE, 3)\n statusMessage: "Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'."\n targetLink: \'https://container.googleapis.com/v1/projects/1079157603081/zones/us-central1-c/clusters/cluster-1\'\n zone: \'us-central1-c\'>] finished with error: Google Compute Engine: Required \'compute.instanceGroups.update\' permission for \'projects/1079157603081/zones/us-central1-c/instanceGroups/gke-cluster-1-default-pool-b54fa6be-grp\'.']
exit_code: 1
Google Compute Engine: Not all instances running in IGM after 18.798524988s. Expected 3, running 0, transitioning 3. Current errors: [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.instances.create' permission for 'projects/1079157603081/zones/us-central1-c/instances/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.create' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.disks.setLabels' permission for 'projects/1079157603081/zones/us-central1-c/disks/gke-cluster-2-default-pool-36522bb7-0vkl' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.use' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]'); [PERMISSIONS_ERROR]: Instance 'gke-cluster-2-default-pool-36522bb7-0vkl' creation failed: Required 'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]') (truncated).
Steps to Reproduce
Please see Terraform google_project_iam_binding deletes GCP compute engine default service account from IAM principals for details.
- Enable the Compute Engine API in the GCP project.
- Verify that the GCP Compute Engine default service account exists in the IAM console view.
- Run terraform apply.
- Verify that the GCP Compute Engine default service account is gone from the IAM principals menu, although it still remains in the IAM Service Accounts menu.
Now the GCP Compute Engine default service account has lost its role and cannot manage Compute Engine instances or GKE nodes.
Important Factoids
No
References
Impact
GKE clusters can no longer be created once the GCP Compute Engine default service account has disappeared from the IAM console. I needed to create another project to be able to create a GKE cluster.
Cause
GCP identified that Terraform had deleted the Google APIs Service Agent, which is a Google-managed service account.
Some Google Cloud services need access to your resources so that they can act on your behalf. For example, when you use Cloud Run to run a container, the service needs access to any Pub/Sub topics that can trigger the container.
To meet this need, Google creates and manages service accounts for many Google Cloud services. These service accounts are known as Google-managed service accounts. You might see Google-managed service accounts in your project's IAM policy, in audit logs, or on the IAM page in the Cloud Console.
Google-managed service accounts are not listed in the Service accounts page in the Cloud Console.
Google APIs Service Agent. Your project is likely to contain a service account named the Google APIs Service Agent, with an email address that uses the following format: [email protected]
This service account runs internal Google processes on your behalf. It is automatically granted the Editor role (roles/editor) on the project.
Terraform should not delete any such GCP-managed internal service account that is essential to running GCP services, hence I regard this as a Terraform bug.
Fix
According to GCP:
To fix this issue you can add the service agent in the IAM page using the Add option at the top. The principal will be "${PROJECT_ID}@cloudservices.gserviceaccount.com" and add the editor role.
As per the error message, add '[email protected]' in IAM.
'compute.subnetworks.useExternalIp' permission for 'projects/1079157603081/regions/us-central1/subnetworks/default' (when acting as '[email protected]') (truncated).
The Google APIs Service Agent is restored in the view.
Create GKE.
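The console fix described by GCP can probably also be expressed in Terraform. The following is only a minimal sketch, assuming the project ID from this issue (test-tf-sa); the data source name "current" and the resource name are made up for illustration. It uses the non-authoritative google_project_iam_member resource so that no other member of roles/editor is touched.

```hcl
# Look up the project to obtain its numeric project number.
# "test-tf-sa" is the project ID used in this issue; adjust as needed.
data "google_project" "current" {
  project_id = "test-tf-sa"
}

# Re-grant roles/editor to the Google APIs Service Agent.
# Non-authoritative: other members holding roles/editor are preserved.
resource "google_project_iam_member" "google_apis_service_agent_editor" {
  project = data.google_project.current.project_id
  role    = "roles/editor"
  member  = "serviceAccount:${data.google_project.current.number}@cloudservices.gserviceaccount.com"
}
```

Note that this only helps if no google_project_iam_binding or google_project_iam_policy in the same configuration is still managing roles/editor authoritatively; otherwise the two resources will fight over the role.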
b/304725229
Just to add, as a workaround in Terraform I have been using the following after creating a project.
data "google_iam_policy" "editor" {
  binding {
    members = [
      "serviceAccount:${google_project.project.number}@cloudservices.gserviceaccount.com",
      # "serviceAccount:${google_project.project.number}[email protected]",
    ]
    role = "roles/editor"
  }
}

resource "google_project_iam_policy" "add" {
  policy_data = data.google_iam_policy.editor.policy_data
  project     = google_project.project.project_id
}
This will fix the issue preventing the GKE cluster from being removed.
It will still remove any permissions not tracked by Terraform, including those of the user who created the project, who would generally have the owner role.
However, the issue then is that if you later add policies that aren't all tracked together, one big policy will override the one which was previously created. The same is true for IAM bindings: you can get into a loop where one overrides the other, and each terraform apply will always delete values that were already applied.
The only way to stop the overriding is to apply all policy changes through one google_project_iam_policy per project (or one google_project_iam_binding per role).
Ran into this myself, seems like we have to use "google_project_iam_member" instead. This will add a role to a member, without removing the other members from the role you are assigning.
- google_project_iam_binding: Authoritative for a given role. Updates the IAM policy to grant a role to a list of members. Other roles within the IAM policy for the project are preserved. -> Applies the role to the list of members, retaining the other roles for those members, removing the role from all the members not in that list. (And for some reason also removing the IAM user (member), maybe if they only had one role, which is now removed?)
- google_project_iam_member: Non-authoritative. Updates the IAM policy to grant a role to a new member. Other members for the role for the project are preserved. -> Applies the role to a member, retaining the roles of that member, retaining all members having that role.
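As a rough illustration of that switch, the binding from the original reproduction could be rewritten along these lines. This is a sketch, not a tested fix: it assumes the variables PROJECT_ID and roles_to_grant_to_service_account and the google_service_account.terraform resource from the issue's configuration are still declared alongside it.

```hcl
# Non-authoritative variant of the reproduction config:
# one google_project_iam_member per role instead of one
# google_project_iam_binding per role.
resource "google_project_iam_member" "terraform" {
  for_each = toset(var.roles_to_grant_to_service_account)

  project = var.PROJECT_ID
  role    = each.value

  # A single member (not a list); existing members of each role,
  # such as the Google-managed service accounts, are left in place.
  member = "serviceAccount:${google_service_account.terraform.email}"
}
```

The trade-off is that Terraform no longer enforces who else holds these roles, so members added outside Terraform will not be cleaned up.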
Tested with:
Terraform v1.2.9
terraform/provider hashicorp/google:4.38.0
problem persists.
This is a feature of the provider and the project_iam_binding module. In authoritative mode, it removes any additions to the specified role. You just have to add the "Google-managed service accounts" to your terraform specification for those roles.
bindings = {
  # We need it because of this.
  # https://cloud.google.com/iam/docs/service-accounts#google-managed
  "roles/editor" = [
    "serviceAccount:${module.project-factory.project_number}[email protected]",
    "serviceAccount:${module.project-factory.project_number}@cloudservices.gserviceaccount.com"
  ]
  # ...
}
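Without the module, the same idea might look roughly like the sketch below when applied to the plain google_project_iam_binding resource from the original report. The data source names "this" and "default" are made up for illustration, both data sources are assumed to resolve against the project configured on the provider, and google_service_account.terraform is assumed to exist as in the issue's configuration.

```hcl
# Project lookup, used for the numeric project number.
data "google_project" "this" {}

# Compute Engine default service account of the provider's project.
data "google_compute_default_service_account" "default" {}

# Authoritative for roles/editor: every member that should keep the
# role must be listed here, including the Google-managed accounts.
resource "google_project_iam_binding" "editor" {
  project = data.google_project.this.project_id
  role    = "roles/editor"

  members = [
    "serviceAccount:${google_service_account.terraform.email}",
    "serviceAccount:${data.google_compute_default_service_account.default.email}",
    "serviceAccount:${data.google_project.this.number}@cloudservices.gserviceaccount.com",
  ]
}
```

Any other principal that currently holds roles/editor in the project would also have to be added to the list, otherwise it is removed on the next apply.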
Removing the iam-serviceaccount team because I don't think this is an issue with the google_service_account resource.
Updated the description to prevent our bot from continuing to add the label
For anyone using Pulumi:
my service accounts' roles were deleted, I believe due to using new gcp.projects.IAMBinding.
I changed it to:
new gcp.projects.IAMMember(`foo`, {
  project: gcpServiceAccountProject || gcpProject,
  role,
  member: pulumi.interpolate`serviceAccount:${gcpServiceAccount.email}`,
})
Also had to delete the original service accounts and then recreate them.