terraform-provider-postgresql

"tuple concurrently updated" error on concurrent GRANT statements

Open Bhashit opened this issue 2 years ago • 20 comments

Terraform Version

1.1.1

Affected Resource(s)

  • postgresql_grant

Terraform Configuration Files


provider "postgresql" {
  host             = var.postgres_host
  port             = var.postgres_port
  username         = var.root_user_name
  password         = var.root_user_password
  expected_version = "12.3"
  superuser        = false
}

resource "postgresql_grant" "connect_db" {
  database    = postgresql_database.db.name
  object_type = "database"
  privileges  = ["CREATE", "CONNECT"]
  role        = postgresql_role.svc_admin.name
}

resource "postgresql_grant" "use_schema" {
  database    = postgresql_database.db.name
  object_type = "schema"
  privileges  = ["CREATE", "USAGE"]
  role        = postgresql_role.svc_admin.name
  schema      = "public"
}

Panic Output

╷
│ Error: could not execute revoke query: pq: tuple concurrently updated
│ 
│   with module.svc.postgresql_grant.use_schema,
│   on .terraform/modules/svc/main.tf line 118, in resource "postgresql_grant" "use_schema":
│  118: resource "postgresql_grant" "use_schema" {
│ 
╵

Expected Behavior

Multiple GRANT statements should get executed correctly.

Actual Behavior

terraform apply fails intermittently when multiple GRANT statements are involved.

Steps to Reproduce

  1. Run terraform apply with multiple grant statements. Creating a large number of grants with a for_each makes the error more likely.

Important Factoids

Found these threads on the postgres/terraform mailing lists:

  1. https://www.postgresql.org/message-id/[email protected]
  2. https://www.postgresql.org/message-id/[email protected]
  3. https://discuss.hashicorp.com/t/for-each-support-sequential-operation/34680

The "solution" seems to be to run things sequentially. Ideally, however, the provider itself should handle this, for example by locking the table appropriately or by retrying after a backoff period before failing.
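
For a small, fixed set of grants, one way to force sequential execution without slowing down the whole run is an explicit depends_on between the grant resources. This is only a sketch based on the configuration above. It assumes that ordering the two grants is enough to keep them from running at the same time, and it does not help when the grants are instances of a single for_each resource.

```hcl
resource "postgresql_grant" "connect_db" {
  database    = postgresql_database.db.name
  object_type = "database"
  privileges  = ["CREATE", "CONNECT"]
  role        = postgresql_role.svc_admin.name
}

resource "postgresql_grant" "use_schema" {
  database    = postgresql_database.db.name
  object_type = "schema"
  privileges  = ["CREATE", "USAGE"]
  role        = postgresql_role.svc_admin.name
  schema      = "public"

  # Assumption: an explicit dependency makes Terraform apply this grant
  # only after connect_db has finished, so the two never run in parallel.
  depends_on = [postgresql_grant.connect_db]
}
```

Other resources in the same workspace are unaffected and still run in parallel.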

One interesting thing I noticed: when TF exited from my terraform apply, it didn't save the state. So it created some resources, but they weren't tracked in the state. That could be a Terraform bug, but I thought I should at least mention it here.

Bhashit avatar Feb 02 '22 22:02 Bhashit

I think this is a similar problem to the issue/fix in #169

JamesTimms avatar Mar 03 '22 11:03 JamesTimms

I'm having the same issue, even after upgrading to 1.15 and configuring parallelism=0

fabiopaiva avatar Mar 14 '22 21:03 fabiopaiva

We were able to fix the issue by setting TF_CLI_ARGS_apply="-parallelism=1" (on provider version 1.15.0), but this certainly isn't ideal. I would love a fix as described above.

phclark avatar Mar 23 '22 15:03 phclark

same issue, using provider version 1.15.0

iskarbnik avatar Mar 27 '22 09:03 iskarbnik

We are experiencing the same problem even after upgrading to v1.15.0. We loop through a list of 3 items, create a Postgres user for each, and grant connect rights to those users. It fails consistently in the grant section.

postgresql_role.user["example"]: Creating...
postgresql_role.user["example_read"]: Creating...
postgresql_role.user["example_emergency"]: Creating...
postgresql_role.user["example_read"]: Creation complete after 1s [id=example_read]
postgresql_role.user["example"]: Creation complete after 1s [id=example]
postgresql_role.user["example_emergency"]: Creation complete after 1s [id=example_emergency]
postgresql_grant.user_db_connect_grants["example_emergency"]: Creating...
postgresql_grant.user_db_connect_grants["example"]: Creating...
postgresql_grant.user_db_connect_grants["example_read"]: Creating...
postgresql_grant.user_db_connect_grants["example"]: Creation complete after 1s [id=example_example_database]
postgresql_grant.user_db_connect_grants["example_read"]: Creation complete after 2s [id=example_read_example_database]
│ 
│ Error: could not execute revoke query: pq: tuple concurrently updated
│ 
│   with postgresql_grant.user_db_connect_grants["example_emergency"],
│   on resources.tf line 106, in resource "postgresql_grant" "user_db_connect_grants":
│  106: resource "postgresql_grant" "user_db_connect_grants" {

Terraform version 1.1.8

Update: Using the argument -parallelism=1 seems to have solved the problem.

ElMudo avatar Apr 19 '22 11:04 ElMudo

Bump. I have many resources in addition to postgres grants in the same workspace. I don't want to have to set -parallelism=1.

red8888 avatar May 02 '22 15:05 red8888

This really needs to be fixed, -parallelism=1 makes Terraform runs take hours....

philip-harvey avatar Jul 25 '22 20:07 philip-harvey

Would love to see this fixed. A retry approach for this specific error would be sweet!

marcneander avatar Sep 12 '22 15:09 marcneander

Using postgresql provider parameter max_connections = 1 seems to help as a workaround. But a real fix would be highly appreciated.
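
A minimal sketch of what that could look like, reusing the provider block from the original report. Only the max_connections line is new, and the other values are taken from that report rather than from this comment.

```hcl
provider "postgresql" {
  host      = var.postgres_host
  port      = var.postgres_port
  username  = var.root_user_name
  password  = var.root_user_password
  superuser = false

  # Workaround: allow the provider only one open connection, so that
  # statements are sent to the database one at a time.
  max_connections = 1
}
```

Unlike -parallelism=1, this only limits the postgresql provider, so other resources in the workspace can still be applied in parallel.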

dpolivaev avatar Sep 13 '22 15:09 dpolivaev

The "fix" in #224 has broken the max_connections = 1 setting, so I'm having to pin to provider version 1.1.7

philip-harvey avatar Dec 22 '22 17:12 philip-harvey

still have this issue in 1.18

debu99 avatar Feb 17 '23 13:02 debu99

Any chance the change in #224 can be rolled back? In the past we could set max_connections = 1 and it would work, now it's just totally broken for versions > 1.1.7

philip-harvey avatar Apr 17 '23 16:04 philip-harvey

Hey everyone - I think I have a working fix in #352.

kylejohnson avatar Sep 20 '23 19:09 kylejohnson

As a workaround you can use -parallelism=1 in your terraform apply/destroy commands. It will run very slowly but it avoids this issue until a fix can be implemented. I've been using it with 1.21.0 for a while now with no issues.

reddragond avatar Oct 06 '23 14:10 reddragond

Hi everyone,

I went ahead and released a beta version which targets the tuple-concurrently-updated branch used in #352.

Using this beta release, the tuple concurrently updated error has been fixed 100% of the time in my internal usage at $dayjob.

I'd appreciate it if anyone else could test and confirm these results as well.

terraform {
  required_providers {
    postgresql = {
      source  = "cyrilgdn/postgresql"
      version = "1.21.1-beta.1"
    }
  }
}

kylejohnson avatar Nov 01 '23 19:11 kylejohnson

Oops, didn't mean to close.

kylejohnson avatar Nov 01 '23 21:11 kylejohnson

One datapoint: our internal TF module, which creates a DB and executes several GRANTs, just ran perfectly on the first try with version "1.21.1-beta.1" 👍

Previously, the first run was almost guaranteed to trigger the "tuple concurrently updated" condition. We always needed at least 2 (sometimes 3) attempts to run to completion.

Thanks a lot for fixing this!

rudl002 avatar Nov 28 '23 16:11 rudl002

Really appreciate the work👍

debu99 avatar Nov 29 '23 01:11 debu99

Hi. We still face this issue with 1.21.1-beta.1. Not only that, but when we try to consolidate postgresql_grant and postgresql_grant_role resources into 2 big ones (merging dicts), we get something different:

Error: could not get advisory lock for members of role ROLE_NAME: pq: deadlock detected
...
...
...
Error: pq: grant options cannot be granted back to your own grantor
...
resource "postgresql_grant" "all" {

joaocc avatar Dec 12 '23 04:12 joaocc

I'm also facing this issue in 1.22.0. In case it's relevant, I'm creating grants using a for_each (as below).

resource "postgresql_grant" "grant_all_on_tables" {
  for_each    = toset(var.full_access_users)

  database    = postgresql_database.main.name
  role        = each.value
  schema      = "public"
  object_type = "table"
  privileges  = ["DELETE", "INSERT", "REFERENCES", "SELECT", "TRIGGER", "TRUNCATE", "UPDATE"]
}

rmhw avatar Mar 27 '24 12:03 rmhw