
feat: Added column level access

Open kda-jt opened this issue 3 years ago • 28 comments

This PR adds the ability to manage privileges on a per column basis.

General overview

This PR focuses on the postgresql_grant resource. It adds column as a new object_type value, along with a new columns argument. It does not change the behaviour of other object_types in this resource.

Example:

resource "postgresql_grant" "grant" {
  database    = "test_database"
  role        = "test_role"
  schema      = "public"
  object_type = "column" # new object_type
  objects     = ["test_table"]
  columns     = ["col1", "col2"] # new argument
  privileges  = ["UPDATE"]
}

To simplify the code, when object_type is column, only one privilege is allowed in the privileges argument, and only one table is allowed in the objects argument.

Important changes

Here are the changes to the code that were the most tricky for me to work out.

The readRolePrivileges SQL statement

We needed an SQL statement that could detect column-level privileges while excluding those implied by table-level privileges. We achieved this in the following way:

  • Fetch all column-level permissions from the information_schema.column_privileges table. Let's call this "A".
  • Fetch table-level permissions and "explode" them into one row per column. Let's call this "B".
  • Subtract "B" from "A" to obtain only the explicit column-level permissions.
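A simplified sketch of that approach in SQL (illustrative only; the schema filter and aliases are assumptions, not the exact statement in the PR):

```sql
-- "A": privileges granted explicitly at the column level
SELECT grantee, table_name, column_name, privilege_type
FROM information_schema.column_privileges
WHERE table_schema = 'public'

EXCEPT

-- "B": table-level privileges "exploded" into one row per column
SELECT tp.grantee, tp.table_name, c.column_name, tp.privilege_type
FROM information_schema.table_privileges tp
JOIN information_schema.columns c
  ON  c.table_schema = tp.table_schema
  AND c.table_name   = tp.table_name
WHERE tp.table_schema = 'public';
```

The EXCEPT set operation performs the "subtract B from A" step, leaving only grants that exist at the column level without a matching table-level grant.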

Changing privileges in the postgresql_grant resource to ForceNew

Originally, the privileges argument did not force recreation of the resource. This was a problem: when the privileges in a grant resource changed, the update function would be triggered with only the new configuration. So the revocation would revoke the new permissions rather than the old ones, which is not very useful.

Let's look at an example.

# the base resource
resource "postgresql_grant" "example" {
...
  object_type = "column"
  objects     = ["test_table"]
  columns     = ["col1"]
  privileges  = ["UPDATE"]
}

We want to change the privilege to INSERT:

# the updated resource
resource "postgresql_grant" "example" {
...
  object_type = "column"
  objects     = ["test_table"]
  columns     = ["col1"]
  privileges  = ["INSERT"]
}

The code would apply and work fine. However, when we'd look at the permissions in the PG DB, we could see that the test_role had both INSERT & UPDATE on that column.

This is because the revocation statement that ran was REVOKE INSERT ON TABLE... instead of REVOKE UPDATE ON TABLE....

I could not find a way to fetch the privilege stored in the state, and setting the argument to ForceNew solved this problem, so I did that.
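With ForceNew, Terraform destroys the old resource (using the privileges recorded in state) before creating the new one. For the example above, the provider would roughly issue the following sequence (a sketch, not the exact generated SQL):

```sql
-- destroy phase: revoke the privilege recorded in the old state
REVOKE UPDATE (col1) ON test_table FROM test_role;

-- create phase: grant the privilege from the new configuration
GRANT INSERT (col1) ON test_table TO test_role;
```

Because the destroy step runs against the old state rather than the new configuration, the stale UPDATE grant is correctly removed.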

Changing revoking statements for object_type table

Previously, with object_type table, any change to the table grant resource meant that a revocation statement would run and revoke all privileges on the table before granting the new ones.

This meant that, in the following configuration, destroying table_grant would taint column_grant:

resource "postgresql_grant" "table_grant" {
...
  object_type = "table"
  objects     = ["test_table"]
  privileges  = ["SELECT" ]
}
resource "postgresql_grant" "column_grant" {
...
  object_type = "column"
  objects     = ["test_table"]
  columns     = ["col1"]
  privileges  = ["INSERT"]
}

To prevent this, I needed to modify the revocation statements for tables to make them more fine-grained.
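Sketching the difference in SQL (illustrative; the exact statements the provider generates may differ): the old behaviour revoked everything, while the fine-grained version revokes only what the table resource actually manages:

```sql
-- old behaviour: wipes all of test_role's privileges on the table,
-- which (per the taint issue described above) also clobbered the
-- column-level grants managed by the other resource
REVOKE ALL PRIVILEGES ON test_table FROM test_role;

-- fine-grained behaviour: revoke only the privilege this resource granted,
-- leaving the INSERT column grant on col1 untouched
REVOKE SELECT ON test_table FROM test_role;
```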

Testing

This PR was tested thoroughly. I added all the required test cases, and did a lot of manual testing with various scenarios to make sure everything worked as expected. A few of the scenarios I tried:

  • creating, destroying, and recreating the same column grant
  • changing the privileges in a column grant resource and applying multiple times
  • changing the privileges in column grant resource while having a table grant resource on the same table
  • creating, destroying, and recreating the same column grant, while having a table grant resource on the same table
  • creating and destroying table grants while having column grants

For each scenario, I repeatedly checked in the database to see which permissions were effectively granted, and reran terraform plan to verify that the Terraform refresh was accurate.

I tried these scenarios on various PostgreSQL versions running in a local Docker container:

  • 10.18.0
  • 11.13.0
  • 12.8.0
  • 13.4.0

Closing words

I'm not a Go developer & this is my first Terraform provider patch. I'm also not a PostgreSQL expert.

I tried my best however to make a high quality PR that would benefit my organisation & hopefully others as well.

@cyrilgdn I'm available to discuss this PR in a virtual meeting if you'd like.

kda-jt avatar Sep 07 '21 17:09 kda-jt

Thanks for this! Unfortunately in my testing it doesn't appear to correctly track state discrepancies, e.g. if I manually run REVOKE for a particular column, refreshing doesn't update the state to handle the change.

bismark avatar Sep 28 '21 17:09 bismark

Any reason this explicitly excludes system columns? For example, I've got a use case for managing access to xmin.

bismark avatar Oct 06 '21 16:10 bismark

Another issue I found: if columns are removed from the table and I remove them from the resource, terraform apply fails with

Error: could not execute revoke query: pq: column "foo" of relation "bar" does not exist

bismark avatar Oct 13 '21 01:10 bismark

Hi @bismark, thanks for all your feedback. I didn't have enough time to get back to working on this during the last few weeks. I'll get back to it to try and answer your questions & push an update.

kda-jt avatar Oct 28 '21 12:10 kda-jt

@bismark

Thanks for this! Unfortunately in my testing it doesn't appear to correctly track state discrepancies, e.g. if I manually run REVOKE for a particular column, refreshing doesn't update the state to handle the change.

You're absolutely right. I've just pushed a commit that should fix this behaviour.

Any reason this explicitly excludes system columns? For example, I've got a use case for managing access to xmin.

I don't see any reason why. I removed that bit.

Another issue I found: if columns are removed from the table and I remove them from the resource, terraform apply fails with

Error: could not execute revoke query: pq: column "foo" of relation "bar" does not exist

The changes I just added should cover this scenario.

kda-jt avatar Oct 29 '21 16:10 kda-jt

@cyrilgdn could I get your input on this?

kda-jt avatar Nov 05 '21 10:11 kda-jt

@kda-jt I triggered the tests and will take a look as soon as possible.

Thanks for your work on that :+1:

cyrilgdn avatar Nov 08 '21 11:11 cyrilgdn

@cyrilgdn the linting test failed. So I fixed the issues and pushed a new commit. Should be fine now.

kda-jt avatar Nov 09 '21 11:11 kda-jt

@kda-jt I pulled your latest changes, but unfortunately it appears broken: the resources are always marked as required to be destroyed then recreated. it shows all columns as needing to be added, even after applying.

bismark avatar Nov 12 '21 00:11 bismark

@kda-jt I pulled your latest changes, but unfortunately it appears broken: the resources are always marked as required to be destroyed then recreated. it shows all columns as needing to be added, even after applying.

It seems I cannot replicate the behaviour you're experiencing. Here's what I tried:

  1. PostgreSQL running in a local Docker container, started with this command:
docker run -p 5432:5432 -e POSTGRESQL_PASSWORD=useless_password --name postgresql bitnami/postgresql:latest
  2. Logged into the PG database & created a table:
PGPASSWORD='useless_password' psql -h localhost -p 5432 -U postgres # login
# create table
CREATE TABLE accounts (                                                                 
 user_id serial PRIMARY KEY, uselesscol VARCHAR,
 username VARCHAR ( 50 ) UNIQUE NOT NULL
);
  3. In an empty directory, created a file called main.tf containing the following code, and ran terraform init && terraform apply:
provider "postgresql" {
  host      = "localhost"
  port      = 5432
  database  = "postgres"
  username  = "postgres"
  password  = "useless_password"
  sslmode   = "disable"
  superuser = false
}

resource "postgresql_role" "user_role" {
  name     = "user"
  login    = true
  password = "user"
}

resource "postgresql_grant" "grant_col" {
  database    = "postgres"
  role        = "user"
  schema      = "public"
  object_type = "column" # new object_type
  objects     = ["accounts"]
  columns     = ["username", "uselesscol"] 
  privileges  = ["SELECT"]
}
  4. Ran terraform plan and got:
No changes. Your infrastructure matches the configuration.

@bismark Unless you can provide a thorough explanation of how things are broken, and a way of replicating that behaviour, I'm afraid I won't be able to do much.

kda-jt avatar Nov 14 '21 17:11 kda-jt

@bismark Unless you can provide a thorough explanation of how things are broken, and a way of replicating that behaviour, I'm afraid I won't be able to do much.

Yah, I'm not too sure. With the latest commit, no matter what I try, the resources are marked as needing replacement every time. Example:

next_staging=> revoke select ON user_settings from foo;
REVOKE
next_staging=> \dp user_settings
                                               Access privileges
 Schema |     Name      | Type  |     Access privileges     |           Column privileges           | Policies
--------+---------------+-------+---------------------------+---------------------------------------+----------
 public | user_settings | table | next_web=arwdDxt/next_web | id:                                  +|
        |               |       |                           |   analytics=r/next_web               +|

First run:

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # postgresql_grant.grant_foo_select_columns["user_settings"] will be created
  + resource "postgresql_grant" "grant_foo_select_columns" {
      + columns           = [
          + "id",
        ]
      + database          = "next_staging"
      + id                = (known after apply)
      + object_type       = "column"
      + objects           = [
          + "user_settings",
        ]
      + privileges        = [
          + "SELECT",
        ]
      + role              = "foo"
      + schema            = "public"
      + with_grant_option = false
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Success:

next_staging=> \dp user_settings
                                               Access privileges
 Schema |     Name      | Type  |     Access privileges     |           Column privileges           | Policies
--------+---------------+-------+---------------------------+---------------------------------------+----------
 public | user_settings | table | next_web=arwdDxt/next_web | id:                                  +|
        |               |       |                           |   analytics=r/next_web               +|
        |               |       |                           |   foo=r/next_web                     +|

Subsequent runs:

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the last "terraform apply":

  # postgresql_grant.grant_foo_select_columns["user_settings"] has been changed
  ~ resource "postgresql_grant" "grant_foo_select_columns" {
      ~ columns           = [
          - "id",
        ]
        id                = "foo_next_staging_public_column_user_settings_id"
        # (7 unchanged attributes hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the relevant attributes using ignore_changes, the following plan may include actions to undo or respond to
these changes.

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
-/+ destroy and then create replacement

Terraform will perform the following actions:

  # postgresql_grant.grant_foo_select_columns["user_settings"] must be replaced
-/+ resource "postgresql_grant" "grant_foo_select_columns" {
      ~ columns           = [ # forces replacement
          + "id",
        ]
      ~ id                = "foo_next_staging_public_column_user_settings_id" -> (known after apply)
        # (7 unchanged attributes hidden)
    }

Plan: 1 to add, 0 to change, 1 to destroy.

So for now I'm sticking with 9b0e616ebeb4186fa15d60d854a24dd048c774c5 since it works (despite the limitations I've described above). I suppose we'll hope for more testers!

bismark avatar Nov 22 '21 23:11 bismark

@bismark I'd like to make sure I fully understand the issue you're having.

With the info provided, I can only think of the following scenario: You've created the column grant through Terraform, then "manually" revoked it.

In this case, any subsequent run of terraform plan should mark the resource as to be replaced and that's the normal behaviour. If you run apply (after the manual REVOKE), the column grant will be recreated and subsequent plans shouldn't show any changes.

Do you confirm that's the scenario you're having?

kda-jt avatar Nov 23 '21 09:11 kda-jt

@bismark I'd like to make sure I fully understand the issue you're having.

With the info provided, I can only think of the following scenario: You've created the column grant through Terraform, then "manually" revoked it.

Unfortunately no, the only thing I'm doing between the first terraform apply and the next is running \dp <table>. It's quite a strange behavior: the Terraform state correctly says the grant was applied, but on the subsequent refresh the provider appears to fail to read the current permissions on the table and thus thinks it has gotten out of sync. I've tried with multiple tables, with various combinations of permissions/roles to see if I'm hitting some edge case.... 🤷🏻‍♂️

bismark avatar Nov 24 '21 04:11 bismark

Okay, crystal clear. I'm however not at all able to reproduce this behaviour.

Could you please tell me:

  • what PostgreSQL version are you running?
  • what kind of PostgreSQL are you running? (AWS RDS? Azure? Vanilla? etc.)
  • is the table you're trying to manage access to "special" in any way? (system table, view?)
  • any other info you think might be relevant.

kda-jt avatar Nov 24 '21 09:11 kda-jt

@cyrilgdn Hi ! I'm available if you'd like to chat about this PR.

kda-jt avatar Dec 09 '21 11:12 kda-jt

@kda-jt @cyrilgdn it would be great if you could merge this. I'm working on the RLS feature https://github.com/Tonkonozhenko/terraform-provider-postgresql/commit/c78e3a68f8a9175018457fc94a0a85a688a8f065 and it would be great to use both together.

Tonkonozhenko avatar Dec 28 '21 14:12 Tonkonozhenko

@cyrilgdn I rebased the PR on the latest upstream master, resolved all conflicts & added the suggested test. I'm available to further discuss this PR if needed

kda-jt avatar Feb 07 '22 20:02 kda-jt

Looking forward to seeing this merged, thanks for the contribution. Unfortunately, we're still unable to manage roles with Terraform unless column-level access is doable.

Quentin-M avatar Mar 09 '22 08:03 Quentin-M

It's been a few months, but I was able to finally revisit this...

  • what kind of PostgreSQL are you running? (AWS RDS? Azure? Vanilla? etc.)

I've narrowed the issue I'm having (column grants always being marked as "must be replaced") down to cases where I am working with an AWS RDS PostgreSQL database.

FWIW, I am manually setting up an SSH tunnel prior to running terraform; I haven't tested with a public-facing DB yet. I'm really not sure why this would matter, though I'm equally stumped as to why https://github.com/cyrilgdn/terraform-provider-postgresql/commit/9b0e616ebeb4186fa15d60d854a24dd048c774c5 continues to work fine against an RDS DB but the branch head doesn't...

Running against a local database, everything works fine.

  • what PostgreSQL version are you running?

RDS (broken): 13.4
Local (working): 13.6

  • is the table you're trying to manage access to "special" in any way? (system table, view?)

Just a plain old table.

bismark avatar Mar 16 '22 05:03 bismark

@cyrilgdn I just pushed an update to fix a bug that was introduced by this change.

I ran the superuser acceptance tests, and they were OK. I don't have access to an RDS DB, so couldn't run the RDS acceptance tests.

Again, I'm available to discuss this PR in a virtual meeting if you'd like.

@bismark I tried running PostgreSQL 13.4 locally to see if I could replicate the issue. I could not. And I don't have access to an AWS environment. You can start investigating yourself by running this SQL statement on both the local and RDS DBs (don't forget to plug in the variables) and comparing results. This could maybe help you suggest a modification that fixes the issue you're having.

kda-jt avatar Mar 29 '22 09:03 kda-jt

@bismark I tried running PostgreSQL 13.4 locally to see if I could replicate the issue. I could not. And I don't have access to an AWS environment. You can start investigating yourself by running this SQL statement on both the local and RDS DBs (don't forget to plug in the variables) and comparing results. This could maybe help you suggest a modification that fixes the issue you're having.

Thanks for the helpful pointer. I'm still digging in a bit, but just at first glance it's clear something is not working: when I query information_schema.column_privileges on RDS, I only get back system tables. My guess is it boils down to the difference between the rds_superuser role that my "superuser" RDS role has as opposed to the actual Superuser privileges that my role has on a local db.

bismark avatar Mar 29 '22 18:03 bismark

Thanks for the helpful pointer. I'm still digging in a bit, but just at first glance it's clear something is not working: when I query information_schema.column_privileges on RDS, I only get back system tables. My guess is it boils down to the difference between the rds_superuser role that my "superuser" RDS role has as opposed to the actual Superuser privileges that my role has on a local db.

I've done a little bit of research on the differences RDS introduces to the "superuser" role, and this could indeed be the reason behind the weirdness you're facing. Apparently, the rds_superuser role is just a regular role with extended permissions on the DB.

More info about rds_superuser vs PG Superuser

On this very helpful page, we can find the result of a \du command on an RDS PG.

postgres=> \du
                                                               List of roles
    Role name    |                         Attributes                         |                          Member of                          
-----------------+------------------------------------------------------------+-------------------------------------------------------------
 postgres        | Create role, Create DB                                    +| {rds_superuser}
                 | Password valid until infinity                              | 
 rds_password    | Cannot login                                               | {}
 rds_replication | Cannot login                                               | {}
 rds_superuser   | Cannot login                                               | {pg_monitor,pg_signal_backend,rds_replication,rds_password}
 rdsadmin        | Superuser, Create role, Create DB, Replication, Bypass RLS+| {}
                 | Password valid until infinity                              | 

The postgres role used is actually not the PG Superuser, because that role is reserved for AWS so they can manage multi-AZ replication and whatnot. The rdsadmin role is the real superuser, meaning that in this specific case postgres is a simple user that can only interact with objects it owns or has been GRANTed access to.

And, as per this Stackoverflow answer,

Users can only see data in the COLUMNS table of INFORMATION_SCHEMA for tables on which they have read permissions.

This would mean that the rds_superuser role cannot see the column-level privileges for tables it does not have read access to, which could explain why Terraform keeps trying to recreate the resources on every plan.
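A quick way to check this hypothesis is to run the same kind of lookup the provider performs, connected as the provider's role, on both the local and RDS databases (an illustrative query, not the provider's exact statement):

```sql
-- run as the role the provider connects with; if the hypothesis holds,
-- tables that role cannot read will be missing from the RDS result entirely
SELECT grantee, table_name, column_name, privilege_type
FROM information_schema.column_privileges
WHERE table_schema = 'public'
ORDER BY table_name, column_name;
```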

To confirm whether my understanding is correct, could you please tell me:

  1. Is your Terraform code using the rds_superuser role in the provider block?
  2. Does the rds_superuser role own the tables you're trying to interact with through Terraform?

Thanks for your help @bismark

kda-jt avatar Mar 30 '22 13:03 kda-jt

Okay! I managed to get my hands on a PostgreSQL RDS database to test my hypothesis.

First test

For the first test, I used these parameters:

  • Users in the DB: postgres, user, fuser
  • TF applied with user: postgres
  • Table created by user: postgres
  • Column grant given through TF to: fuser

Results:

  • fuser can select the columns in the accounts table
  • terraform plans ✅ work as intended, without trying to recreate the column grant

Second test

For the second test, I used these parameters:

  • Users in the DB: postgres, user, fuser
  • TF applied with user: postgres
  • Table created by user: user ⚠️ The only change between the two setups
  • Column grant given through TF to: fuser

Results:

  • fuser can select the columns in the accounts table
  • terraform plans ❌ DO NOT work as intended, constantly trying to recreate the column grant

ℹ️ Code snippets used in both tests:
CREATE TABLE accounts (
         user_id serial PRIMARY KEY,
         username VARCHAR ( 50 ) UNIQUE NOT NULL,
         uselesscol VARCHAR );
terraform {
  required_providers {
    postgresql = {
      source  = "<local path to built provider>"
      version = "1.16.0"
    }
  }
  required_version = ">= 1.0"
}

provider "postgresql" {
  host      = "something.rds.amazonaws.com"
  port      = 5432
  database  = "somedb"
  username  = "postgres"
  password  = "somepassword"
  superuser = false # necessary for RDS
}

resource "postgresql_grant" "grant_col_select" {
  database    = "somedb"
  role        = "fuser"
  schema      = "public"
  object_type = "column" # new object_type
  objects     = ["accounts"]
  columns     = ["uselesscol", "username"]
  privileges  = ["SELECT"]
}

So the hypothesis is confirmed. The rds_superuser can indeed create the column grant, but cannot see it when running the SQL statement that refreshes the state, and this causes the terraform plan to constantly try to recreate the column grant.

ℹ️ I ran that SQL statement as postgres and as user in both scenarios, and in the first test postgres could see the column grants, but not in the second one.

@bismark it works in the older commit (https://github.com/cyrilgdn/terraform-provider-postgresql/commit/9b0e616ebeb4186fa15d60d854a24dd048c774c5) and not in the later ones simply because I changed the SQL statement that runs when refreshing the TF state for column grants. Edit: I just tried the old statement & the new one on an RDS instance, and they both behave similarly, meaning I'm currently not sure why you saw this behaviour.

Next steps

I just need to rework that statement to ensure it works with RDS. I'll hopefully get to that soon.

kda-jt avatar Mar 30 '22 16:03 kda-jt

I'm very interested in getting this functionality published and am willing to help out. Is there anything besides the RDS issue that remains outstanding? Is there a plan for working around the RDS superuser visibility limitation?

wilsonjackson avatar Aug 18 '22 18:08 wilsonjackson

Checking in on this. With the fixes I've applied my company has been using this provider with RDS databases successfully for a couple months now.

@kda-jt If you don't have time to devote to this effort, would you object to me opening a new PR from my fork that includes the fixes I've applied? Not trying to steal credit for your work, of course — you did the heavy lifting. I'd just like to move towards getting this merged. Otherwise, if you have time, my PR to your fork should include all the necessary updates.

wilsonjackson avatar Oct 25 '22 21:10 wilsonjackson

Hi @wilsonjackson I'm terribly sorry, life happens and I couldn't work on this for quite a while. Please, feel free to do whatever you need to get this merged.

Kudos for the work you put in fixing it 💪🏼

kda-jt avatar Oct 27 '22 12:10 kda-jt

@wilsonjackson I've approved & merged your PR on my repo, and also rebased my repo on the upstream.

Hope this helps

kda-jt avatar Oct 27 '22 12:10 kda-jt

Thanks so much @kda-jt! Totally understand about life.

@cyrilgdn With the query fix merged in, I think this PR is in a good spot. Care to re-review? As an aside, perhaps you've noticed already, but the latest tag for postgres now pulls 15, which seems to fail the test suite.

wilsonjackson avatar Nov 01 '22 18:11 wilsonjackson

Hi @cyrilgdn 👋🏼

What are your thoughts on this PR now?

kda-jt avatar Dec 06 '22 19:12 kda-jt

Looking forward to this!

hSATAC avatar Jan 13 '23 01:01 hSATAC