terraform-provider-databricks
[ISSUE] Issue with `databricks_mws_workspaces` resource token rotation
There is a similar, but not identical, existing issue: https://github.com/databricks/terraform-provider-databricks/issues/2750 . Unlike that one, this issue does not involve time_rotating.
Configuration
I believe our code is basically the same as the example shown at https://registry.terraform.io/providers/databricks/databricks/1.25.1/docs/resources/mws_workspaces
resource "databricks_mws_workspaces" "this" {
  provider       = databricks.mws
  account_id     = var.databricks_account_id
  workspace_name = var.prefix
  aws_region     = var.region

  credentials_id           = databricks_mws_credentials.this.credentials_id
  storage_configuration_id = databricks_mws_storage_configurations.this.storage_configuration_id
  network_id               = databricks_mws_networks.this.network_id

  token {}
}

output "databricks_token" {
  value     = databricks_mws_workspaces.this.token[0].token_value
  sensitive = true
}
I would guess it doesn't matter for the error, but in our actual code, lifetime_seconds is set to 90 days (instead of being left unset, which defaults to 30 days).
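For reference, a 90-day lifetime would look roughly like the sketch below. This is a minimal illustration rather than our exact code: the other arguments are elided, and the value is just 90 days expressed in seconds (90 × 86400 = 7776000).

```hcl
resource "databricks_mws_workspaces" "this" {
  # ... other arguments as in the configuration above ...

  token {
    # 90 days in seconds; when unset, the provider default is 2592000 (30 days).
    lifetime_seconds = 7776000
  }
}
```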
Expected Behavior
Terraform apply succeeds.
Actual Behavior
Errors appeared during cron-scheduled Terraform applies on 2024-01-09 at around 6:00 PM, 8:41 PM, and 9:12 PM EST. They were variants of:
Error: cannot read xxx: Invalid access token.
with xxx,
A subsequent cron-scheduled Terraform apply succeeded with no intervention.
I don't know if it particularly matters, but I see 2 tokens with creation times around the problem time.
databricks --profile xx token-management list --created-by-username xx --output json | jq 'map((.creation_time, .expiry_time) |= (. / 1000 | strftime("%Y-%m-%d %H:%M:%S")))'
[
  {
    ...
    "creation_time": "2024-01-10 11:02:21",
    "expiry_time": "2024-04-09 11:02:21",
    ...
  },
  {
    ...
    "creation_time": "2024-01-10 01:42:32",
    "expiry_time": "2024-04-09 01:42:32",
    ...
  }
]
Databricks support says that if the issue happens again, we should run with export TF_LOG="DEBUG", delete the tokens, and apply again.
Steps to Reproduce
terraform apply
Terraform and provider versions
Terraform version 1.5.7, provider version 1.25.1
Is it a regression?
I didn't try any other versions. I can't see any notes about changes to databricks_mws_workspaces in provider versions up to 1.36.3.
Debug Output
Important Factoids
Would you like to implement a fix?
What are you using the token for? Are you using the token for anything other than to configure the provider to talk to this new workspace? Reason I ask is that I'm working on a mechanism to allow you to use the account-level provider to manage workspace-level resources. This will eliminate the need to get a token from this resource, and it should eliminate a class of issues that arises when rotating the token created by this resource.
See #3188.
Hmm, the token is used for just about everything needed to "manage" the workspace, which is what the docs suggest doing. Quoting the docs:
"Code that creates workspaces and code that manages workspaces must be in separate terraform modules to avoid common confusion between provider = databricks.mws and provider = databricks.created_workspace. This is why we specify databricks_host and databricks_token outputs, that have to be used in the latter modules."
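To make that usage concrete, here is a minimal sketch of how a workspace-managing module might consume those outputs. The variable names var.databricks_host and var.databricks_token are assumptions for illustration (they stand in for whatever mechanism passes the outputs between modules), not our actual code.

```hcl
# Hypothetical input variables carrying the outputs of the workspace-creating module.
variable "databricks_host" {
  type = string
}

variable "databricks_token" {
  type      = string
  sensitive = true
}

# Workspace-level provider, authenticated with the token issued by
# the token {} block of databricks_mws_workspaces.
provider "databricks" {
  alias = "created_workspace"
  host  = var.databricks_host
  token = var.databricks_token
}
```

Every workspace-level resource in that module goes through this provider, so once the token expires or is invalidated, reads of those resources fail with an authentication error.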
We had another similar Terraform failure in a different SDLC environment. However, in this environment, when I checked for tokens with token-management list, there were none. Reapplying resolved the issue. So ...
In the original failure, 2 tokens were somehow created in the midst of 3 apply failures. This latest failure is more straightforward: the existing token had simply expired.