terraform-provider-databricks
[ISSUE] apply makes immaterial changes to resources, even though neither the HCL file nor the resource has changed.
I have a databricks_job resource and a related databricks_permissions resource defined in the following configuration. After applying the initial changes with terraform apply, I ran terraform plan and, to my surprise, changes were listed even though neither the configuration nor the actual resources had changed at all. I can keep running terraform apply on this configuration over and over, and I see the same changes listed each time.
Configuration
variable "jobs_runner_service_principal_application_id" {
description = "Application ID of the 'prod-jobs-runner' service principal."
type = string
nullable = false
}
resource "databricks_job" "model_orchestrator" {
name = "prod-model-orchestrator"
run_as {
service_principal_name = var.jobs_runner_service_principal_application_id
}
webhook_notifications {
on_failure {
id = "273c8641-4f21-4438-9d96-03d4a806f24b"
}
}
task {
task_key = "prod-model-preprocess"
run_if = "ALL_SUCCESS"
notebook_task {
source = "GIT"
notebook_path = "services/model/preprocess"
}
library {
pypi {
package = "pydantic==2.5.2"
}
}
library {
pypi {
package = "databricks-sdk==0.22.0"
}
}
job_cluster_key = "prod-model-preprocess_cluster"
}
task {
task_key = "prod-model-polling"
run_if = "ALL_SUCCESS"
notebook_task {
source = "GIT"
notebook_path = "services/model/polling"
}
library {
pypi {
package = "databricks-sdk==0.22.0"
}
}
library {
pypi {
package = "pydantic==2.5.2"
}
}
job_cluster_key = "prod-model-polling_cluster"
depends_on {
task_key = "prod-model-preprocess"
}
}
task {
task_key = "prod-model-combine"
run_if = "ALL_SUCCESS"
notebook_task {
source = "GIT"
notebook_path = "services/model/combine"
}
library {
pypi {
package = "pydantic==2.5.2"
}
}
library {
pypi {
package = "databricks-sdk==0.22.0"
}
}
job_cluster_key = "prod-model-combine_cluster"
depends_on {
task_key = "prod-model-polling"
}
}
max_concurrent_runs = 100
job_cluster {
job_cluster_key = "prod-model-preprocess_cluster"
new_cluster {
spark_version = "12.2.x-scala2.12"
spark_conf = {
"spark.databricks.adaptive.autoOptimizeShuffle.enabled" = "true"
"spark.databricks.cluster.profile" = "singleNode"
"spark.master" = "local[*, 4]"
}
runtime_engine = "STANDARD"
policy_id = "9C6308F7030051E9"
# i3.2xlarge has 8 vCPU and 61 GB RAM
node_type_id = "i3.2xlarge"
enable_elastic_disk = false
data_security_mode = "SINGLE_USER"
custom_tags = {
ResourceClass = "SingleNode"
}
aws_attributes {
zone_id = "auto"
spot_bid_price_percent = 100
first_on_demand = 1
availability = "SPOT_WITH_FALLBACK"
}
}
}
job_cluster {
job_cluster_key = "prod-model-polling_cluster"
new_cluster {
spark_version = "12.2.x-scala2.12"
spark_conf = {
"spark.databricks.cluster.profile" = "singleNode"
"spark.master" = "local[*, 4]"
}
runtime_engine = "STANDARD"
# m5d.large has 2 vCPU and 8 GB RAM
node_type_id = "m5d.large"
enable_elastic_disk = false
data_security_mode = "SINGLE_USER"
custom_tags = {
ResourceClass = "SingleNode"
}
aws_attributes {
zone_id = "us-east-1f"
spot_bid_price_percent = 100
first_on_demand = 1
availability = "SPOT_WITH_FALLBACK"
}
}
}
job_cluster {
job_cluster_key = "prod-model-combine_cluster"
new_cluster {
spark_version = "12.2.x-scala2.12"
spark_conf = {
"spark.databricks.adaptive.autoOptimizeShuffle.enabled" = "true"
"spark.databricks.cluster.profile" = "singleNode"
"spark.master" = "local[*, 4]"
}
runtime_engine = "STANDARD"
policy_id = "9C6308F7030051E9"
# i3.2xlarge has 8 vCPU and 61 GB RAM
node_type_id = "i3.2xlarge"
data_security_mode = "SINGLE_USER"
custom_tags = {
ResourceClass = "SingleNode"
}
aws_attributes {
zone_id = "auto"
spot_bid_price_percent = 100
first_on_demand = 1
availability = "SPOT_WITH_FALLBACK"
}
}
}
git_source {
url = "https://github.com/myorg/myrepo/"
provider = "gitHub"
branch = "main"
}
}
resource "databricks_permissions" "model_orchestrator" {
job_id = databricks_job.model_orchestrator.id
access_control {
service_principal_name = var.jobs_runner_service_principal_application_id
permission_level = "IS_OWNER"
}
access_control {
permission_level = "CAN_MANAGE_RUN"
group_name = "infrastructure"
}
access_control {
permission_level = "CAN_VIEW"
group_name = "users"
}
}
Expected Behavior
After the first apply, running apply again should report no changes, since neither the resources nor the configuration that defines them has changed.
Actual Behavior
Each time I run terraform apply, I get the same output:
databricks_service_principal.prod_platform_jobs_runner: Refreshing state... [id=5930489811099157]
module.prod_model.databricks_job.model_orchestrator: Refreshing state... [id=501652838178442]
module.prod_model.databricks_permissions.model_orchestrator: Refreshing state... [id=/jobs/501652838178442]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  ~ update in-place

Terraform will perform the following actions:

  # module.prod_model.databricks_job.model_orchestrator will be updated in-place
  ~ resource "databricks_job" "model_orchestrator" {
        id   = "501652838178442"
        name = "prod-model-orchestrator"
        # (9 unchanged attributes hidden)

      ~ job_cluster {
            # (1 unchanged attribute hidden)

          ~ new_cluster {
              ~ enable_elastic_disk = true -> false
                # (14 unchanged attributes hidden)

                # (1 unchanged block hidden)
            }
        }
      ~ task {
          ~ job_cluster_key = "prod-model-combine_cluster" -> "prod-model-preprocess_cluster"
          ~ task_key        = "prod-model-combine" -> "prod-model-preprocess"
            # (5 unchanged attributes hidden)

          - depends_on {
              - task_key = "prod-model-polling" -> null
            }

          ~ notebook_task {
              ~ notebook_path = "services/model/combine" -> "services/model/preprocess"
                # (2 unchanged attributes hidden)
            }

            # (5 unchanged blocks hidden)
        }
      ~ task {
          ~ job_cluster_key = "prod-model-preprocess_cluster" -> "prod-model-combine_cluster"
          ~ task_key        = "prod-model-preprocess" -> "prod-model-combine"
            # (5 unchanged attributes hidden)

          + depends_on {
              + task_key = "prod-model-polling"
            }

          ~ notebook_task {
              ~ notebook_path = "services/model/preprocess" -> "services/model/combine"
                # (2 unchanged attributes hidden)
            }

            # (5 unchanged blocks hidden)
        }

        # (7 unchanged blocks hidden)
    }

  # module.test_model.databricks_permissions.model_orchestrator will be updated in-place
  ~ resource "databricks_permissions" "model_orchestrator" {
        id = "/jobs/939182301413075"
        # (2 unchanged attributes hidden)

      - access_control {
          - group_name       = "infrastructure" -> null
          - permission_level = "CAN_MANAGE_RUN" -> null
        }
      - access_control {
          - group_name       = "users" -> null
          - permission_level = "CAN_VIEW" -> null
        }
      + access_control {
          + permission_level       = "IS_OWNER"
          + service_principal_name = "24b3d6ce-fe33-484d-92b4-0484841a38"
        }
      + access_control {
          + group_name       = "infrastructure"
          + permission_level = "CAN_MANAGE_RUN"
        }
      + access_control {
          + group_name       = "users"
          + permission_level = "CAN_VIEW"
        }
    }
Plan: 0 to add, 2 to change, 0 to destroy.
Steps to Reproduce
Run terraform apply multiple times.
Terraform and provider versions
❯ terraform --version
Terraform v1.5.7
on darwin_arm64
databricks provider version = "1.42.0"
Is it a regression?
No.
Important Factoids
The task ordering issue seems related to https://discuss.hashicorp.com/t/dynamic-task-foreach-order-changes/48699/7 (a sketch of the pattern from that thread is included below).
As for the permissions, I do not understand why it removes and re-adds the same access_control blocks each time...
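For context, the linked thread is about task blocks generated with a dynamic/for_each construct, where the iteration order determines the order of the rendered blocks. Below is a minimal, purely illustrative sketch of that pattern; the resource name, cluster settings, and notebook paths are made up. Since the configuration in this issue declares its tasks statically, the reordering shown in the plan above presumably comes from how the provider or API orders the task list rather than from for_each.
locals {
  # Hypothetical task map; for_each iterates maps in sorted key order,
  # so the rendered task blocks keep a stable order between plans.
  model_tasks = {
    "prod-model-combine"    = "services/model/combine"
    "prod-model-polling"    = "services/model/polling"
    "prod-model-preprocess" = "services/model/preprocess"
  }
}

resource "databricks_job" "ordered_tasks_example" {
  name = "ordered-tasks-example"

  job_cluster {
    job_cluster_key = "shared_cluster"
    new_cluster {
      spark_version = "12.2.x-scala2.12"
      node_type_id  = "m5d.large"
      num_workers   = 1
    }
  }

  dynamic "task" {
    # One task block per map entry, emitted in sorted key order.
    for_each = local.model_tasks
    content {
      task_key        = task.key
      job_cluster_key = "shared_cluster"
      notebook_task {
        source        = "GIT"
        notebook_path = task.value
      }
    }
  }

  git_source {
    url      = "https://github.com/myorg/myrepo/"
    provider = "gitHub"
    branch   = "main"
  }
}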
I have the same issue. In the databricks_job resource, terraform plan always shows changes to the task blocks even though I didn't change anything in the config file.
I have the same issue with databricks_permissions for the SQL endpoint. databricks provider version = "1.41.0"
I believe the databricks_permissions diff may be related to the fact that this resource currently removes the IS_OWNER permission from the state, so Terraform always plans to add the IS_OWNER block back. At the same time, we don't want to add IS_OWNER to the state when the user hasn't specified it, otherwise there would be a diff showing IS_OWNER being removed. I have a PR that might address this by including IS_OWNER in the state when the user has explicitly included it in their configuration, and leaving it out otherwise: https://github.com/databricks/terraform-provider-databricks/pull/3956.
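If that explanation is right, one possible interim workaround (an assumption on my part, not an official recommendation) is to leave the IS_OWNER entry out of the Terraform configuration so that the config matches what the provider keeps in state, and to let the job's run_as / creator govern ownership. A sketch based on the configuration from this issue:
# Sketch only: drop the IS_OWNER block until the fix lands, so the config
# matches the provider's state and the permissions resource converges.
resource "databricks_permissions" "model_orchestrator" {
  job_id = databricks_job.model_orchestrator.id

  access_control {
    permission_level = "CAN_MANAGE_RUN"
    group_name       = "infrastructure"
  }

  access_control {
    permission_level = "CAN_VIEW"
    group_name       = "users"
  }
}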
We are also seeing the same issue when deploying databricks_job and databricks_permissions resources.
For databricks_job, another parameter that keeps showing up as a diff in every TF deployment is the job-level timeout setting:
  # databricks_job.routing_job_node_types will be updated in-place
  ~ resource "databricks_job" "routing_job_node_types" {
        always_running            = false
        control_run_state         = false
        format                    = "MULTI_TASK"
        id                        = "491136304029932"
        max_concurrent_runs       = 1
        max_retries               = 0
        min_retry_interval_millis = 0
        name                      = "Routing Job for node_types (routing_job_node_types)"
        retry_on_timeout          = false
        tags                      = {
            "clusterIdentifier" = "routing_job"
            "clusterOwner"      = "eng-foresight-team"
        }
      ~ timeout_seconds           = 0 -> 18000
        ....
    }
timeout_seconds is the only diff for this resource, and it shows up every time even though it has already been applied previously.
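While the underlying provider behaviour is investigated, a generic Terraform escape hatch for a single perpetually drifting attribute is lifecycle ignore_changes. The sketch below (task, cluster id, and notebook path are hypothetical placeholders) suppresses the recurring timeout_seconds diff, at the cost of also ignoring intentional changes to that attribute:
resource "databricks_job" "routing_job_node_types" {
  name            = "Routing Job for node_types (routing_job_node_types)"
  timeout_seconds = 18000

  task {
    task_key            = "routing"              # hypothetical task
    existing_cluster_id = "0123-456789-example"  # hypothetical cluster id
    notebook_task {
      notebook_path = "/Jobs/routing"            # hypothetical notebook path
    }
  }

  lifecycle {
    # Stops Terraform from planning an update when only timeout_seconds drifts.
    ignore_changes = [timeout_seconds]
  }
}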