terraform-provider-postgresql
terraform-provider-postgresql copied to clipboard
Deadlock on state refresh with multiple grants for single role
Terraform Version
Terraform v1.0.4
on darwin_amd64
+ provider registry.terraform.io/cyrilgdn/postgresql v1.14.0
+ provider registry.terraform.io/hashicorp/random v3.1.0
Affected Resource(s)
- postgresql_grant
Terraform Configuration Files
terraform {
required_version = "~>1.0.0"
required_providers {
postgresql = {
source = "cyrilgdn/postgresql"
version = ">=1.14.0"
}
}
}
variable "postgresql_host" {
default = "127.0.0.1"
}
variable "postgresql_port" {
default = "5432"
}
variable "postgresql_username" {
default = "postgres"
}
variable "postgresql_password" {
sensitive = true
default = "password"
}
variable "postgresql_database" {
default = "postgres"
}
resource "random_password" "rouser_password" {
length = 64
override_special = "!#@"
lifecycle {
ignore_changes = all
}
}
resource "random_password" "monitoruser_password" {
length = 64
override_special = "!#@"
lifecycle {
ignore_changes = all
}
}
provider "postgresql" {
host = var.postgresql_host
port = var.postgresql_port
database = var.postgresql_database
username = var.postgresql_username
password = var.postgresql_password
sslmode = "disable"
connect_timeout = 15
max_connections = 1
superuser = false
}
resource "postgresql_role" "ro_user" {
name = "ro"
login = true
password = random_password.rouser_password.result
inherit = true
roles = [
postgresql_role.fs_readonly.id,
]
}
resource "postgresql_role" "fs_readonly" {
name = "d_fs_readonly"
login = false
lifecycle {
ignore_changes = [
roles,
]
}
}
resource "postgresql_grant" "fs_readonly_db_grant" {
database = "postgres"
role = postgresql_role.fs_readonly.name
object_type = "database"
privileges = ["CONNECT"]
}
resource "postgresql_grant" "fs_readonly_schema_grant" {
database = "postgres"
role = postgresql_role.fs_readonly.name
schema = "public"
object_type = "schema"
privileges = ["USAGE"]
}
Debug Output
https://gist.github.com/mochja/de0700074dce69b3b0ba992ddaf43718
Expected Behavior
State should be correctly refreshed at all times.
Actual Behavior
Refresh state sometimes hangs indefinitely.
Steps to Reproduce
Please list the steps required to reproduce the issue, for example:
-
terraform apply
-
terraform apply
-
terraform apply
-
terraform apply
-
terraform apply
Important Factoids
Running on Postgres 10. Able to build & apply correct plan with -refresh=false
on subsequent runs.
We seem to be having this same issue with 12.4, using cloudposse's atmos, atmost hangs indefinitely while refreshing the state of postgres.
@mochja Thanks for opening this issue and sorry for the response delay.
Your simple test code allowed me to find out what is the issue (We already encountered this problem without being able to reproduce it clearly). It's not really linked to the fact that there's multiple grant on single role but just the multiple grant on schema on postgres
database that are able to be deadlock with a small max_connections
because of a bad connections management.
I already have a fix for that, I'll try to publish it this week.
Any update on this? I can confirm that the issue is exactly as described. Running terraform with TF_LOG=trace
shows a deadlock as the tree is traversed.
I encountered this because I tried to use a max_connections
value of 1
to force the provider to provision everything sequentially to avoid a bug where concurrent grants produce tuple concurrently updated
. Is there a different way to achieve the same thing? Using -parallelism=1
is not really an option given the huge impact it will have on performance.
@mochja Thanks for opening this issue and sorry for the response delay.
Your simple test code allowed me to find out what is the issue (We already encountered this problem without being able to reproduce it clearly). It's not really linked to the fact that there's multiple grant on single role but just the multiple grant on schema on
postgres
database that are able to be deadlock with a smallmax_connections
because of a bad connections management.I already have a fix for that, I'll try to publish it this week.
Any update on this? Currently running into this same problem and a fix (or a good workaround, really) would be highly appreciated!
Hi,
Any update on this? The only way so far to work around it is to set terraform -parallelism=1 but the refresh and apply of the resources takes a long time on databases with large numbers of users
https://github.com/cyrilgdn/terraform-provider-postgresql/pull/351 may be related.
Hi, Thanks for 1.22. Do you know if this is solved by 1.22? Our current workaround includes adding a (ugly but working) series of time_wait... Thx