auto:latest-lts spark_version support
Describe the issue
I am trying to deploy a DAB that creates a new job cluster using the policy id of a Job Compute policy. The Job Compute policy sets this default for spark_version:
"spark_version": {
  "type": "unlimited",
  "defaultValue": "auto:latest-lts"
},
I want my jobs to use the latest LTS Spark version where possible, without having to pin an exact version in the DAB.
Setting spark_version: "auto:latest-lts" in my DAB does not work; I get the following error: "INVALID_PARAMETER_VALUE: Invalid spark version auto:latest-lts." For this to work correctly, I would expect the resulting bundle.tf.json to contain a reference like data.databricks_spark_version.latest.id, using the databricks_spark_version Terraform data source.
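For illustration, here is the HCL equivalent of what I would expect the generated bundle.tf.json to encode. This is a hand-written sketch, not actual CLI output; the "latest_lts" name is mine, and only databricks_spark_version itself is the provider's real data source:

# Sketch only (not actual CLI output): resolve the latest LTS runtime
# via the provider's databricks_spark_version data source; the name
# "latest_lts" is illustrative.
data "databricks_spark_version" "latest_lts" {
  long_term_support = true
}

resource "databricks_job" "Test" {
  name = "_Test"
  job_cluster {
    job_cluster_key = "test-cluster"
    new_cluster {
      spark_version = data.databricks_spark_version.latest_lts.id
      node_type_id  = "Standard_D8ads_v5"
      num_workers   = 4
      # ...policy_id and the remaining fields as generated today...
    }
  }
}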
Omitting spark_version from my DAB produces a bundle.tf.json with an empty string ("spark_version": "") and a similar error: "INVALID_PARAMETER_VALUE: Invalid spark version ."
Are there plans for the CLI to support this use case?
Configuration
bundle.yml:
bundle:
  name: Test

sync:
  include:
    - src/*.py

variables:
  cluster_policy_id:
    description: "The cluster policy used to create the cluster for the job."

resources:
  jobs:
    Test:
      name: _Test
      job_clusters:
        - job_cluster_key: test-cluster
          new_cluster:
            policy_id: ${var.cluster_policy_id}
            apply_policy_default_values: true
            node_type_id: Standard_D8ads_v5
            num_workers: 4
            spark_version: "auto:latest-lts"
      tasks:
        - task_key: _Test
          job_cluster_key: test-cluster
          notebook_task:
            notebook_path: "./src/test.py"
Steps to reproduce the behavior
- Run databricks bundle deploy --var "cluster_policy_id=<job compute policy id>"
- See the error: "INVALID_PARAMETER_VALUE: Invalid spark version auto:latest-lts."
Expected Behavior
The DAB should deploy to Databricks using the latest LTS Spark version.
Actual Behavior
The deployment fails with "INVALID_PARAMETER_VALUE: Invalid spark version auto:latest-lts."
OS and CLI version
Windows 10, Databricks CLI v0.211.0
Is this a regression?
No
Debug Logs
JFYI, this has to be addressed in the Go SDK / API definition, where the Spark version is defined as an always-required field: https://github.com/databricks/databricks-sdk-go/blob/a823ca32fc4199d8cf2269b78cfe89331b4b688a/service/compute/model.go#L1544-L1547
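To make the consequence concrete, here is a minimal sketch (the struct below is a trimmed-down paraphrase of the linked model.go, not the full SDK type): because the field has no omitempty tag, an unset spark_version is always serialized as an empty string, which matches the "Invalid spark version ." error reported above.

package main

import (
	"encoding/json"
	"fmt"
)

// Trimmed-down, paraphrased stand-in for the linked SDK struct:
// SparkVersion has no `omitempty`, so it is serialized even when unset.
type clusterSpec struct {
	SparkVersion string `json:"spark_version"`
	NodeTypeId   string `json:"node_type_id,omitempty"`
}

func main() {
	out, _ := json.Marshal(clusterSpec{NodeTypeId: "Standard_D8ads_v5"})
	fmt.Println(string(out))
	// Prints: {"spark_version":"","node_type_id":"Standard_D8ads_v5"}
}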
cc @mgyucht
+1 and bump; this would help match my infra, which is currently maintained by Terraform.
At the moment we have to be careful when re-deploying Terraform: it updates the Spark version of our instance pool, and all the jobs subsequently fail unless they are redeployed with the Spark version updated in their DABs.