terraform-provider-postgresql icon indicating copy to clipboard operation
terraform-provider-postgresql copied to clipboard

Deadlock on state refresh with multiple grants for single role

Open mochja opened this issue 3 years ago • 7 comments

Terraform Version

Terraform v1.0.4
on darwin_amd64
+ provider registry.terraform.io/cyrilgdn/postgresql v1.14.0
+ provider registry.terraform.io/hashicorp/random v3.1.0

Affected Resource(s)

  • postgresql_grant

Terraform Configuration Files

terraform {
  required_version = "~>1.0.0"
  required_providers {
    postgresql = {
      source = "cyrilgdn/postgresql"
      version = ">=1.14.0"
    }
  }
}

variable "postgresql_host" {
  default = "127.0.0.1"
}
variable "postgresql_port" {
  default = "5432"
}
variable "postgresql_username" {
  default = "postgres"
}
variable "postgresql_password" {
  sensitive = true
  default = "password"
}
variable "postgresql_database" {
  default = "postgres"
}

resource "random_password" "rouser_password" {
  length           = 64
  override_special = "!#@"
  lifecycle {
    ignore_changes = all
  }
}
resource "random_password" "monitoruser_password" {
  length           = 64
  override_special = "!#@"
  lifecycle {
    ignore_changes = all
  }
}

provider "postgresql" {
  host              = var.postgresql_host
  port              = var.postgresql_port
  database          = var.postgresql_database
  username          = var.postgresql_username
  password          = var.postgresql_password
  sslmode           = "disable"
  connect_timeout   = 15
  max_connections   = 1
  superuser         = false
}

resource "postgresql_role" "ro_user" {
  name     = "ro"
  login    = true
  password = random_password.rouser_password.result
  inherit  = true

  roles = [
    postgresql_role.fs_readonly.id,
  ]
}

resource "postgresql_role" "fs_readonly" {
  name  = "d_fs_readonly"
  login = false

  lifecycle {
    ignore_changes = [
      roles,
    ]
  }
}

resource "postgresql_grant" "fs_readonly_db_grant" {
  database    = "postgres"
  role        = postgresql_role.fs_readonly.name
  object_type = "database"
  privileges  = ["CONNECT"]
}

resource "postgresql_grant" "fs_readonly_schema_grant" {
  database    = "postgres"
  role        = postgresql_role.fs_readonly.name
  schema      = "public"
  object_type = "schema"
  privileges  = ["USAGE"]
}

Debug Output

https://gist.github.com/mochja/de0700074dce69b3b0ba992ddaf43718

Expected Behavior

State should be correctly refreshed at all times.

Actual Behavior

Refresh state sometimes hangs indefinitely.

Steps to Reproduce

Please list the steps required to reproduce the issue, for example:

  1. terraform apply
  2. terraform apply
  3. terraform apply
  4. terraform apply
  5. terraform apply

Important Factoids

Running on Postgres 10. Able to build & apply correct plan with -refresh=false on subsequent runs.

mochja avatar Dec 02 '21 15:12 mochja

We seem to be having this same issue with 12.4, using cloudposse's atmos, atmost hangs indefinitely while refreshing the state of postgres.

evanrich avatar Jan 10 '22 18:01 evanrich

@mochja Thanks for opening this issue and sorry for the response delay.

Your simple test code allowed me to find out what is the issue (We already encountered this problem without being able to reproduce it clearly). It's not really linked to the fact that there's multiple grant on single role but just the multiple grant on schema on postgres database that are able to be deadlock with a small max_connections because of a bad connections management.

I already have a fix for that, I'll try to publish it this week.

cyrilgdn avatar Jan 30 '22 22:01 cyrilgdn

Any update on this? I can confirm that the issue is exactly as described. Running terraform with TF_LOG=trace shows a deadlock as the tree is traversed.

I encountered this because I tried to use a max_connections value of 1 to force the provider to provision everything sequentially to avoid a bug where concurrent grants produce tuple concurrently updated. Is there a different way to achieve the same thing? Using -parallelism=1 is not really an option given the huge impact it will have on performance.

jarpoole avatar May 29 '22 22:05 jarpoole

@mochja Thanks for opening this issue and sorry for the response delay.

Your simple test code allowed me to find out what is the issue (We already encountered this problem without being able to reproduce it clearly). It's not really linked to the fact that there's multiple grant on single role but just the multiple grant on schema on postgres database that are able to be deadlock with a small max_connections because of a bad connections management.

I already have a fix for that, I'll try to publish it this week.

Any update on this? Currently running into this same problem and a fix (or a good workaround, really) would be highly appreciated!

grelland avatar Jul 01 '22 15:07 grelland

Hi,

Any update on this? The only way so far to work around it is to set terraform -parallelism=1 but the refresh and apply of the resources takes a long time on databases with large numbers of users

mkuchniak avatar Oct 03 '23 09:10 mkuchniak

https://github.com/cyrilgdn/terraform-provider-postgresql/pull/351 may be related.

mochja avatar Mar 07 '24 09:03 mochja

Hi, Thanks for 1.22. Do you know if this is solved by 1.22? Our current workaround includes adding a (ugly but working) series of time_wait... Thx

joaocc avatar May 13 '24 09:05 joaocc