terraform-provider-boundary

boundary_role resources fail to create on set-grant calls

Open archen opened this issue 3 years ago • 2 comments

The Terraform Boundary Provider (v1.0.1 and v1.0.2) fails to create roles when grant strings exist in the HCL.

Terraform Version

Terraform v1.0.1 on linux_arm64

  • provider registry.terraform.io/hashicorp/boundary v1.0.1

Affected Resource(s)

  • boundary_role

Terraform Configuration Files

terraform {
  required_providers {
    boundary = {
      source  = "hashicorp/boundary"
      version = "1.0.1"
    }
  }
}

provider "boundary" {
  addr             = var.boundary_addr
  recovery_kms_hcl = file("boundary_recovery.hcl")
}

resource "boundary_scope" "global" {
  global_scope = true
  name         = "global"
  scope_id     = "global"
}

resource "boundary_scope" "org" {
  scope_id    = boundary_scope.global.id
  name        = "organization"
  description = var.organization

  auto_create_admin_role   = false
  auto_create_default_role = false
}

resource "boundary_scope" "project" {
  name                     = "project"
  description              = var.ou
  scope_id                 = boundary_scope.org.id
  auto_create_admin_role   = false
  auto_create_default_role = false
}

resource "boundary_role" "global_anon_listing" {
  scope_id = boundary_scope.global.id
  grant_strings = [
    "id=*;type=auth-method;actions=list,authenticate",
    "id=*;type=scope;actions=list,no-op",
    "id={{account.id}};actions=read,change-password"
  ]
  principal_ids = ["u_anon"]
}

resource "boundary_role" "org_anon_listing" {
  scope_id = boundary_scope.org.id
  grant_strings = [
    "id=*;type=auth-method;actions=list,authenticate",
    "type=scope;actions=list",
    "id={{account.id}};actions=read,change-password"
  ]
  principal_ids = ["u_anon"]
}

Debug Output

https://gist.github.com/archen/09650a159f6934d503383cdb08265276

Expected Behavior

I expected the anonymous roles for global and org level scopes to be created with the grant strings applied. During debugging, I succeeded in creating the roles with the equivalent command line calls from the documentation:

# Create global anonymous listing role
$ boundary roles create -name 'global_anon_listing' \
  -recovery-config /tmp/recovery.hcl \
  -scope-id 'global'

$ boundary roles add-grants -id <global_anon_listing_id> \
  -recovery-config /tmp/recovery.hcl \
  -grant 'id=*;type=auth-method;actions=list,authenticate' \
  -grant 'id=*;type=scope;actions=list,no-op' \
  -grant 'id={{account.id}};actions=read,change-password'

$ boundary roles add-principals -id <global_anon_listing_id> \
  -recovery-config /tmp/recovery.hcl \
  -principal 'u_anon'

Actual Behavior

│ Error: error setting grant strings on role: error performing client request during SetGrants call: context deadline exceeded
│
│   with boundary_role.global_anon_listing,
│   on main.tf line 63, in resource "boundary_role" "global_anon_listing":
│   63: resource "boundary_role" "global_anon_listing" {
│
╵
╷
│ Error: error setting grant strings on role: error performing client request during SetGrants call: context deadline exceeded
│
│   with boundary_role.org_anon_listing,
│   on main.tf line 73, in resource "boundary_role" "org_anon_listing":
│   73: resource "boundary_role" "org_anon_listing" {
│
╵

Steps to Reproduce

  1. boundary database init -skip-initial-login-role-creation -config /etc/boundary.d/boundary.hcl
  2. boundary server -config /etc/boundary.d/boundary.hcl
  3. terraform apply -auto-approve

Important Factoids

  1. The command line equivalent works, but the Terraform provider fails.
  2. All nodes in the setup are Raspberry Pi 4s running the arm64 variant of Raspbian.
  3. Postgres 11 is installed from the default Raspbian repos.

archen, Jun 28 '21 22:06

To put a very fine point on the issue, it is strictly the grant_strings portion of the boundary_role resource that fails on any operation that modifies grant strings. The grant_scope_id is unaffected.

Again, the CLI works fine, but the boundary_role resource with grant_strings causes the Boundary server to return a 500. The evidence is in the logs here, captured from a terraform run without the grant_strings lines, after the grants had been applied to the resources with the CLI.

For anyone else running into this issue...

My interim solution is to manage as much as possible in Terraform and apply the grant strings after terraform apply. The following solution does not decide automatically whether to run the grants commands, as I'm using another tool to control the execution flow.

BIG CAVEAT: as-is, this only works as a one-off run. Subsequent terraform apply executions will attempt to reconcile the grant_strings on the in-place resources with the Terraform configuration. Because the grant_strings are no longer in the configuration, terraform apply will attempt to delete the grants, and the run will fail due to the issue with the resource. You can hack around this by putting the grant_strings back after the initial "bootstrapping" terraform run for the roles, but bear in mind that any later change to the resources that triggers a grant modification will still fail.

Using the documentation's global anonymous role as our example:

  1. Remove grant_strings from the terraform
resource "boundary_role" "global_anon_listing" {
  scope_id = "global"
  principal_ids = ["u_anon"]
}
  2. Add terraform output stanzas for each role
output "global_anon_listing_id" {
  value = boundary_role.global_anon_listing.id
}
  3. terraform apply
  4. Use the terraform output and the boundary cli to apply the grants
$ export GLOBAL_ANON_ID=$(terraform output -raw global_anon_listing_id)
$ boundary roles add-grants -id $GLOBAL_ANON_ID \
  -recovery-config /tmp/recovery.hcl \
  -grant 'id=*;type=auth-method;actions=list,authenticate' \
  -grant 'id=*;type=scope;actions=list,no-op' \
  -grant 'id={{account.id}};actions=read,change-password'
  5. OPTIONAL: Replace the grant_strings in the terraform to prevent future failures of terraform apply
resource "boundary_role" "global_anon_listing" {
  scope_id = "global"
  grant_strings = [
    "id=*;type=auth-method;actions=list,authenticate",
    "id=*;type=scope;actions=list,no-op",
    "id={{account.id}};actions=read,change-password"
  ]
  principal_ids = ["u_anon"]
}
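
If you do want the "should I run the grants commands?" decision automated, here is a rough sketch of one way to do it. It is not part of the workaround above. The function name ensure_grants and the RECOVERY_CONFIG variable are made up for illustration, and it assumes that `boundary roles read -format json` prints each existing grant string verbatim inside double quotes, which I have not checked against every Boundary version.

```shell
#!/bin/sh
# Sketch only: add the grants a role is missing, skip the call otherwise.
# Assumption: `boundary roles read -format json` prints existing grant
# strings verbatim inside double quotes.

RECOVERY_CONFIG="${RECOVERY_CONFIG:-/tmp/recovery.hcl}"

# Usage: ensure_grants ROLE_ID GRANT...
# The body runs in a subshell "( ... )" so its variables do not leak.
ensure_grants() (
  role_id="$1"
  shift

  # Read the role once; give up if the read itself fails.
  current="$(boundary roles read -id "$role_id" \
    -recovery-config "$RECOVERY_CONFIG" -format json)" || exit 1

  # Replace the argument list with "-grant G" pairs for missing grants only.
  for grant in "$@"; do
    shift
    # -F: fixed-string match, so '*' and '{{...}}' are taken literally.
    if ! printf '%s\n' "$current" | grep -qF -- "\"$grant\""; then
      set -- "$@" -grant "$grant"
    fi
  done

  if [ "$#" -eq 0 ]; then
    echo "role $role_id already has all grants"
    exit 0
  fi

  boundary roles add-grants -id "$role_id" \
    -recovery-config "$RECOVERY_CONFIG" "$@"
)

# Example call (commented out so sourcing this file has no side effects):
# ensure_grants "$(terraform output -raw global_anon_listing_id)" \
#   'id=*;type=auth-method;actions=list,authenticate' \
#   'id=*;type=scope;actions=list,no-op' \
#   'id={{account.id}};actions=read,change-password'
```

This only ever adds grants. It will not remove grants that exist on the role but are missing from the argument list, so it is not a substitute for Terraform's reconciliation.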

archen, Jun 29 '21 14:06

tl;dr: removing max_open_connections from the boundary configuration database stanza resolves this issue and more.

Deeper down the rabbit hole... I got suspicious that the issue I was having is not directly related to this provider, but to boundary itself. After getting my janky CLI hack above to work, I tried to use my provisioned boundary ssh targets. I was unable to connect to any of them and trying to do so produced suspiciously similar error messages in the boundary logs (db transaction errors). Trying to use the boundary web ui for administration was failing similarly as well.

I removed max_open_connections = 2 from my boundary configuration database stanza, and everything works. I'll work on a writeup at some point to submit to the boundary core project. I still consider this a bit of a problem with the provider, because the two methods of modifying grant_strings behave inconsistently, so I'll leave the ticket open here.

I'm not sure how this provider tickles the APIs differently (or likely just holds connections from the pool in the backend) but even if the only change to be enacted by the terraform is manipulating grant_strings on a boundary_role, the apply will fail when the max_open_connections is set to 2 (the minimum value noted in the config docs).
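
For anyone wanting to check quickly whether their own config has this setting, something like the following should do. It is only a sketch: the function name check_max_open_connections is made up here, and it simply looks for an uncommented max_open_connections assignment anywhere in the file rather than parsing the HCL.

```shell
#!/bin/sh
# Sketch only: report whether a Boundary config file sets max_open_connections.
# This is a plain text match, not an HCL parser, so it cannot tell which
# stanza the setting lives in.

# Usage: check_max_open_connections CONFIG_FILE
# Returns 1 (and prints the matching lines) when the setting is present.
check_max_open_connections() {
  # Lines starting with '#' or '//' do not match, so commented-out
  # settings are ignored.
  if grep -nE '^[[:space:]]*max_open_connections[[:space:]]*=' "$1"; then
    echo "max_open_connections is set in $1"
    return 1
  fi
  echo "max_open_connections is not set in $1"
}

# Example call (commented out so sourcing this file has no side effects):
# check_max_open_connections /etc/boundary.d/boundary.hcl
```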

archen, Jul 01 '21 15:07

I believe this was fixed when https://github.com/hashicorp/boundary/issues/1374 was addressed.

jimlambrt, Jan 27 '24 21:01