terraform-provider-databricks

[ISSUE] Issue with databricks_cluster immediately after workspace creation

Open jiropardo opened this issue 1 year ago • 0 comments

Configuration

# Copy-paste your Terraform configuration here

module "workspace" {
  source                = "./modules/workspace"
  databricks_account_id = var.databricks_account_id
  region                = var.region
}

module "catalogs_binding" {
  source       = "./modules/catalog_bin"
  workspace_id = module.workspace.workspace_id

  depends_on = [module.workspace]
}

module "proxy" {
  source = "./modules/proxy_cluster"

  depends_on = [module.workspace]
}

The cluster resource inside the proxy_cluster module is:

resource "databricks_cluster" "git_proxy" {
  autotermination_minutes = 0

  aws_attributes {
    ebs_volume_count = 1
    ebs_volume_size  = 32
    first_on_demand  = 1
  }

  cluster_name = var.git_proxy_name

  custom_tags = {
    "ResourceClass" = "SingleNode"
  }

  provider      = databricks.workspace
  spark_version = data.databricks_spark_version.latest_lts.id
  node_type_id  = data.databricks_node_type.smallest.id
  num_workers   = 0

  spark_conf = {
    "spark.databricks.cluster.profile" = "singleNode"
    "spark.master"                     = "local[*]"
  }

  spark_env_vars = {
    "GIT_PROXY_ENABLE_SSL_VERIFICATION" = "False"
    "GIT_PROXY_HTTP_PROXY"              = "git_URL"
  }

  timeouts {
    create = "30m"
    update = "30m"
    delete = "30m"
  }
}

Expected Behavior

The cluster should be created, or the provider should return a more descriptive error explaining that, although the API reports the workspace as RUNNING, it is not yet fully operational.

Actual Behavior

The API reported the workspace as RUNNING:

2024-04-10T19:35:44.640-0600 [DEBUG] provider.terraform-provider-databricks_v1.39.0: GET /api/2.0/accounts/XXX/workspaces/XXX
< HTTP/2.0 200 OK
< {
...
<   "workspace_status": "RUNNING",

but the worker environment was not yet recognized, so cluster creation failed:

2024-04-10T19:36:17.232-0600 [ERROR] provider.terraform-provider-databricks_v1.39.0: Response contains error diagnostic: @module=sdk.proto diagnostic_detail="" diagnostic_severity=ERROR tf_proto_version=5.4 tf_req_id=XXX tf_rpc=ApplyResourceChange @caller=/home/runner/work/terraform-provider-databricks/terraform-provider-databricks/vendor/github.com/hashicorp/terraform-plugin-go/tfprotov5/internal/diag/diagnostics.go:58 diagnostic_summary="cannot create cluster: XXX is not able to transition from TERMINATED to RUNNING: worker env WorkerEnvId(workerenv-XXXXX) not found

Note: the actual values in the logs above have been replaced with XXX placeholders.

Steps to Reproduce

Provider version: 1.39 (the logs above show terraform-provider-databricks_v1.39.0)

Is it a regression?

Debug Output

Important Factoids

Would you like to implement a fix?

The dependency ordering below works around the problem. Making the proxy module also wait for module.catalogs_binding appears to add just enough delay for the workspace information to fully propagate.

module "proxy" {
  source = "./modules/proxy_cluster"

  depends_on = [module.workspace, module.catalogs_binding] # fixes issue
}
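If the extra dependency only helps because it buys time, an explicit delay might make that intent clearer. The following is a minimal sketch, not a confirmed fix: it assumes the `time_sleep` resource from the hashicorp/time provider is acceptable in this configuration. The resource name `wait_for_workspace` and the 120s duration are hypothetical; the actual propagation time is not documented in this issue.

```hcl
terraform {
  required_providers {
    time = {
      source = "hashicorp/time"
    }
  }
}

# Hypothetical delay: starts counting once the workspace module has finished.
# The 120s value is a guess, not a measured or documented propagation time.
resource "time_sleep" "wait_for_workspace" {
  depends_on      = [module.workspace]
  create_duration = "120s"
}

module "proxy" {
  source = "./modules/proxy_cluster"

  # Cluster creation begins only after the delay has elapsed.
  depends_on = [time_sleep.wait_for_workspace]
}
```

Unlike depending on module.catalogs_binding, this does not rely on an unrelated module happening to take long enough, but a fixed sleep is still only a workaround and could be too short in some cases.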

jiropardo • Apr 11 '24 02:04