terraform-provider-databricks
[ISSUE] Cannot move cluster to auto-az
Configuration
resource "databricks_cluster" "test_cluster" {
cluster_name = "test_cluster"
spark_version = data.databricks_spark_version.latest_lts.id
node_type_id = "r5.large"
driver_node_type_id = "r5.large"
autotermination_minutes = 20
num_workers = 5
aws_attributes {
availability = "SPOT"
zone_id = "auto"
first_on_demand = 0
spot_bid_price_percent = 100
ebs_volume_type = "GENERAL_PURPOSE_SSD"
ebs_volume_count = 1
ebs_volume_size = 100
}
enable_elastic_disk = true
}
Expected Behavior
Changing "zone_id" from a specific value (ex: "us_east_1a") to "auto" actually changes the zone configuration for this cluster.
Actual Behavior
Changing "zone_id" from a specific value (ex: "us_east_1a") to "auto" does nothing. The zone stays on the previous specific value. In our case, this blocks us because we cannot start the cluster in this zone as we get AWS insufficient capacity errors. So we really want to move to "auto", even if that means recreating the cluster.
It seems this behaviour was introduced in https://github.com/databricks/terraform-provider-databricks/pull/937 to prevent Terraform from restarting a cluster whose AwsAttributes.zone_id is "auto", since such a restart is unneeded and unwanted. But we would argue that an explicit change of the zone should still be applied.
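
For context, PR #937 appears to rely on the Terraform plugin SDK's DiffSuppressFunc mechanism. A minimal sketch of a suppression hook with the effect described above, assuming the usual SDK v2 pattern (illustrative only, not the provider's exact code; see the linked PR for that):

package clusters

import (
	"log"

	"github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"
)

// zoneDiffSuppress (illustrative name): when the configuration asks for
// "auto" but the API has already resolved and stored a concrete zone
// (e.g. "us-east-1a"), report the two values as equal. Terraform then
// plans no change and does not restart the cluster -- which also means an
// intentional move from a concrete zone to "auto" never reaches the plan.
func zoneDiffSuppress(k, old, new string, d *schema.ResourceData) bool {
	if new == "auto" && old != "" {
		log.Printf("[INFO] Suppressing diff on %s (%q -> %q)", k, old, new)
		return true
	}
	return false
}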
Steps to Reproduce
- Define a cluster with a specific zone (e.g. aws_attributes { zone_id = "us-east-1a" })
- Apply
- Move to auto-az (aws_attributes { zone_id = "auto" })
- Apply
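
Since the issue author is fine with recreating the cluster, one possible workaround (based on standard Terraform behaviour, not verified against this provider) is to force replacement, so the new cluster is created with zone_id = "auto" straight from the configuration:

$ terraform apply -replace=databricks_cluster.test_cluster

The -replace flag (Terraform v0.15.2 and later) plans a destroy-and-create for the addressed resource even when no attribute diff is detected.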
Terraform and provider versions
databricks/databricks 1.6.1
We are experiencing this issue as well across 80+ AWS clusters. We have to edit the attribute manually in Databricks.
Thank you for the feature request! The team currently operates in a limited capacity and has to prioritize carefully, so we cannot provide a timeline for implementing this feature. Please open a Pull Request if you'd like to see this feature sooner, and we'll guide you through the journey.
I have tried using auto in my Databricks job (which also creates clusters) and it works. The docs don't mention it, but it worked. I was using 1.6.5 when I tried it.
I am currently facing this issue when deploying a Databricks Asset Bundle through GitHub CI/CD.
Using databricks/setup-cli@main (version 0.221.1) to deploy the bundle.
I have to update the zone_id to auto manually.
Confirming this is still an issue on the latest provider version:
$ terraform version
Terraform v1.8.5
on darwin_amd64
+ provider registry.terraform.io/databricks/databricks v1.47.0
If the cluster exists and is set to a non-auto availability zone, but your template specifies auto, Terraform does not update the cluster.
Use of databricks/setup-cli@main with v0.219.0 works fine: I can update from a non-auto AZ to auto.
The CLI uses the versions below:
Terraform: 1.5.5
Databricks: 1.40.0
Using Terraform v1.7.2 and Databricks provider v1.59.0:
If I create a cluster with zone_id specified in the Terraform configuration, the zone ID is correctly set, and any subsequent changes to the zone_id in the configuration are applied as expected.
However, if I initially create a cluster without specifying a zone_id in the configuration and later attempt to set or modify it using Terraform, the zone_id does not update.
I am wondering if this is being caused by ZoneDiffSuppress: https://github.com/databricks/terraform-provider-databricks/blob/e98eee711c3a5e8670394e7c86d4b6cb5dbcec92/clusters/resource_cluster.go#L105-L111
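
For reference, assuming the linked lines follow the usual SDK v2 pattern, the suppression would hang off the zone_id attribute roughly like this (an illustrative sketch reusing the zoneDiffSuppress shape from earlier in this thread, not the provider's exact source):

package clusters

import "github.com/hashicorp/terraform-plugin-sdk/v2/helper/schema"

// Sketch of wiring a DiffSuppressFunc into the zone_id attribute. When the
// function returns true, old and new are treated as equal, so `terraform
// plan` shows no change for this attribute -- matching the "specific zone
// to auto is a no-op" behaviour reported in this thread.
var zoneIDSchema = &schema.Schema{
	Type:             schema.TypeString,
	Optional:         true,
	Computed:         true, // the API reports back the resolved zone
	DiffSuppressFunc: zoneDiffSuppress,
}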