terraform-provider-databricks

[ISSUE] Issue with `databricks_mws_permission_assignment` resource - very unstable

Open HJatMobi opened this issue 1 year ago • 8 comments

I have logic that assigns different permissions to workspaces based on a configuration file.

In the problematic run, there was one new workspace for which 4 assignments should have been created.

Configuration

resource "databricks_mws_permission_assignment" "add_serviceadmin_as_admin" {
  provider = databricks.azure_account

  for_each     = local.all_workspace_admin_access
  workspace_id = each.value.workspace_id

  principal_id = data.databricks_group.service_admin.id
  permissions  = ["ADMIN"]
  depends_on = [
    databricks_metastore_assignment.toprod
  ]
}

resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin" {
  provider = databricks.azure_account

  for_each     = local.all_workspaces
  workspace_id = each.value.workspace_id

  principal_id = data.databricks_service_principal.had_gitlab_runners_spn.sp_id
  permissions  = ["ADMIN"]
  depends_on = [
    databricks_metastore_assignment.toprod
  ]
}

resource "databricks_mws_permission_assignment" "add_user_monitor" {
  provider = databricks.azure_account

  for_each = databricks_group.data_scientist_user_group

  workspace_id = lookup(local.datascientist_role_dsw_lookup, each.value.display_name)
  principal_id = each.value.id
  permissions  = ["USER"]
}

resource "databricks_mws_permission_assignment" "add_mi_model_access" {
  provider = databricks.azure_account

  for_each = databricks_group.mi_model_registry_access_group

  workspace_id = lookup(local.mi_model_access_role_dsw_lookup, each.value.display_name)
  principal_id = each.value.id
  permissions  = ["USER"]
}

Expected Behavior

All four permissions should be assigned to the workspace without any problems.

Actual Behavior

Terraform plan was always successful:

# databricks_mws_permission_assignment.add_had_gitlab_runner_as_admin["lcs-pretention-hadronpoc-dsw"] will be created
  + resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin" {
      + id           = (known after apply)
      + permissions  = [
          + "ADMIN",
        ]
      + principal_id = ZZZZZZZZZZ
      + workspace_id = XXXXXXXXX039
    }
  # databricks_mws_permission_assignment.add_mi_model_access["lcs-pretention-hadronpoc-dsw"] will be created
  + resource "databricks_mws_permission_assignment" "add_mi_model_access" {
      + id           = (known after apply)
      + permissions  = [
          + "USER",
        ]
      + principal_id = (known after apply)
      + workspace_id = XXXXXXXXX039
    }
  # databricks_mws_permission_assignment.add_serviceadmin_as_admin["lcs-pretention-hadronpoc-dsw"] will be created
  + resource "databricks_mws_permission_assignment" "add_serviceadmin_as_admin" {
      + id           = (known after apply)
      + permissions  = [
          + "ADMIN",
        ]
      + principal_id = YYYYYYYY
      + workspace_id = XXXXXXXXX039
    }
  # databricks_mws_permission_assignment.add_user_monitor["lcs-pretention-hadronpoc-dsw"] will be created
  + resource "databricks_mws_permission_assignment" "add_user_monitor" {
      + id           = (known after apply)
      + permissions  = [
          + "USER",
        ]
      + principal_id = (known after apply)
      + workspace_id = XXXXXXXXX039
    }

When a new workspace was added to the configuration, three of the four permission assignments failed. Again, all of them were for the same workspace.

╷
│ Error: cannot create mws permission assignment: Permission assignments not supported for current workspace.
│
│   with databricks_mws_permission_assignment.add_had_gitlab_runner_as_admin["lcs-pretention-hadronpoc-dsw"],
│   on main.tf line 222, in resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin":
│  222: resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin" {
│
╵
╷
│ Error: cannot create mws permission assignment: Permission assignments not supported for current workspace.
│
│   with databricks_mws_permission_assignment.add_user_monitor["lcs-pretention-hadronpoc-dsw"],
│   on main.tf line 262, in resource "databricks_mws_permission_assignment" "add_user_monitor":
│  262: resource "databricks_mws_permission_assignment" "add_user_monitor" {
│
╵
╷
│ Error: cannot create mws permission assignment: Permission assignments not supported for current workspace.
│
│   with databricks_mws_permission_assignment.add_mi_model_access["lcs-pretention-hadronpoc-dsw"],
│   on main.tf line 272, in resource "databricks_mws_permission_assignment" "add_mi_model_access":
│  272: resource "databricks_mws_permission_assignment" "add_mi_model_access" {
│
╵

After the first retry, the behaviour was still the same.

After the second retry, two more permissions were assigned and only one still failed:

╷
│ Error: cannot create mws permission assignment: Permission assignments not supported for current workspace.
│
│   with databricks_mws_permission_assignment.add_had_gitlab_runner_as_admin["lcs-pretention-hadronpoc-dsw"],
│   on main.tf line 222, in resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin":
│  222: resource "databricks_mws_permission_assignment" "add_had_gitlab_runner_as_admin" {
│
╵

After three more retries, the last assignment finally succeeded as well.

Steps to Reproduce

Terraform apply

Terraform and provider versions

Initializing provider plugins...

  • Finding hashicorp/azurerm versions matching "3.74.0"...
  • Finding databricks/databricks versions matching "1.27.0"...
  • Finding azure/azapi versions matching "1.9.0"...
  • Finding terraform-example.com/customprovider/iiq versions matching "0.0.1"...
  • Installing hashicorp/azurerm v3.74.0...
  • Installed hashicorp/azurerm v3.74.0 (signed by HashiCorp)
  • Installing databricks/databricks v1.27.0...
  • Installed databricks/databricks v1.27.0 (self-signed, key ID 92A95A66446BCE3F)
  • Installing azure/azapi v1.9.0...
  • Installed azure/azapi v1.9.0 (signed by a HashiCorp partner, key ID 6F0B91BDE98478CF)
  • Installing terraform-example.com/customprovider/iiq v0.0.1...
  • Installed terraform-example.com/customprovider/iiq v0.0.1 (unauthenticated)

Partner and community providers are signed by their developers.

HJatMobi, Sep 28 '23 09:09

@HJatMobi I notice that two of the databricks_mws_permission_assignment resources have depends_on, but the other two do not. Could that be the root cause?

nkvuong, Oct 16 '23 13:10
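
The suggestion above could be tried as follows. This is only a sketch: it reuses the resource and local names from the original configuration and adds the same depends_on to one of the two resources that lack it (add_mi_model_access would be changed the same way). Whether this ordering alone removes the error is an assumption, not something confirmed in this thread.

```hcl
resource "databricks_mws_permission_assignment" "add_user_monitor" {
  provider = databricks.azure_account

  for_each = databricks_group.data_scientist_user_group

  workspace_id = lookup(local.datascientist_role_dsw_lookup, each.value.display_name)
  principal_id = each.value.id
  permissions  = ["USER"]

  # Assumption: apply the same ordering constraint that the two ADMIN
  # assignments already have, so the metastore assignment is created first.
  depends_on = [
    databricks_metastore_assignment.toprod
  ]
}
```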

@nkvuong That would only explain a failure on the first run. It would not explain why I had to retry 5 times before all permission assignments finally applied.

Moreover, all permission assignments were for the same workspace, so it also would not explain why one assignment was created on the first run, two more were created after two further retries, and the last one was created only after 5 retries in total.

HJatMobi, Oct 17 '23 05:10

I am also getting the same issue. For me, no permissions get assigned even after retrying. All of them fail with:

Error: cannot create mws permission assignment: Permission assignments not supported for current workspace.

kirandw, Jan 09 '24 20:01

Same as @kirandw and unfortunately this is proving to be a big blocker for me. Any chance this issue could get assigned and looked at?

timharsch, Mar 05 '24 22:03

@timharsch The problem seems to come from the order in which things are created. Before permissions can be assigned, it seems that the workspace has to be assigned to Unity Catalog. Once I assigned the workspace to Unity Catalog, databricks_mws_permission_assignment seemed to work as expected. Try that.

kirandw, Mar 06 '24 01:03
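
A minimal sketch of that ordering for a single workspace is shown here. The variable names and the resource labels ("this", "admin") are hypothetical, and provider configuration for the metastore assignment is left out because it depends on the setup. Reading workspace_id from the metastore assignment gives Terraform an implicit dependency, so the permission assignment is only attempted after the workspace has been attached to a Unity Catalog metastore.

```hcl
# Hypothetical inputs; none of these names come from the original issue.
variable "workspace_id" { type = number }
variable "metastore_id" { type = string }
variable "principal_id" { type = number }

# Attach the workspace to a Unity Catalog metastore first.
resource "databricks_metastore_assignment" "this" {
  workspace_id = var.workspace_id
  metastore_id = var.metastore_id
}

resource "databricks_mws_permission_assignment" "admin" {
  provider = databricks.azure_account

  # Referencing the metastore assignment's attribute makes Terraform
  # create it before this resource, without an explicit depends_on.
  workspace_id = databricks_metastore_assignment.this.workspace_id
  principal_id = var.principal_id
  permissions  = ["ADMIN"]
}
```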

The same thing happened to me. Even after Unity Catalog was assigned to the workspace, it continued to fail for a few hours.

ruloweb, Mar 26 '24 10:03

I've also run into this issue. It feels like an eventual-consistency issue after creating the workspace, where it can take anywhere from a few minutes to hours before you're able to apply permissions.

borrell, Jun 05 '24 22:06
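
If the cause really is eventual consistency, one possible workaround is to put a fixed delay between the metastore assignment and the permission assignments, using the time_sleep resource from the hashicorp/time provider. This is a sketch under that assumption: the label wait_for_workspace and the 10-minute duration are guesses, and since delays of up to hours are reported above, a fixed wait may still not be enough.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Wait after the metastore assignment before touching permissions.
# The duration is a guess, not a documented propagation time.
resource "time_sleep" "wait_for_workspace" {
  depends_on      = [databricks_metastore_assignment.toprod]
  create_duration = "10m"
}

resource "databricks_mws_permission_assignment" "add_serviceadmin_as_admin" {
  provider = databricks.azure_account

  for_each     = local.all_workspace_admin_access
  workspace_id = each.value.workspace_id

  principal_id = data.databricks_group.service_admin.id
  permissions  = ["ADMIN"]

  # Depend on the delay instead of directly on the metastore assignment.
  depends_on = [
    time_sleep.wait_for_workspace
  ]
}
```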

I'm also facing this issue. I tried @kirandw's suggestion, but that didn't work. Eventually it goes through.

ChristianGroentved, Oct 03 '24 19:10