
Editing Delta Live Tables pipeline, cluster "policy_id" not persisted

Open troxil opened this issue 2 years ago • 4 comments

Description

  • When using databricks pipelines edit --pipeline-id "{{ pipeline_id }}" --settings dlt_tables/settings.json, I found that adding or changing clusters[].policy_id does not persist, and subsequent triggers of this pipeline fail.
  • In our case, a cluster policy is required, and runs fail if the policy_id is absent.
  • Manually adding this setting in the UI works every time.
  • Per the PipelinesNewCluster model, the policy_id field does appear to be supported.
databricks pipelines edit --pipeline-id "{{ pipeline_id }}"  --settings dlt_tables/settings.json
Successfully edited pipeline settings: https://instance.cloud.databricks.com/#joblist/pipelines/{{ pipeline_id }}
databricks pipelines start --pipeline-id "{{ pipeline_id }}"                                 
Started an update {{ pipeline_update_id }} for pipeline {{ pipeline_id }}.
databricks --version
Version 0.17.0
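
For context, a minimal sketch of the relevant portion of dlt_tables/settings.json (values are placeholders, field names follow the Pipelines API, and other settings such as libraries are omitted):

{
    "clusters": [
        {
            "label": "default",
            "policy_id": "<policy_id>",
            "autoscale": {
                "min_workers": 1,
                "max_workers": 2
            }
        }
    ]
}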

troxil avatar Aug 02 '22 04:08 troxil

Thanks for the bug report. The CLI passes through the cluster objects to the API unmodified.

Does this behave as expected when you initially create the pipeline?

pietern avatar Aug 10 '22 13:08 pietern

@pradeepgv-db @enzam-db Could you take a look at this?

pietern avatar Sep 01 '22 11:09 pietern

@pietern looking into this.

pradeepgv-db avatar Sep 01 '22 18:09 pradeepgv-db

@troxil I tested this locally with the following config and saw the policy_id persisted as part of the pipeline configuration. Could you share the clusters config you are using?

"clusters": [
        {
            "label": "default",
            "policy_id": "<policy_id>",
            ..
            ..
        }
    ],
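
One way to double-check whether the change persisted is to read the pipeline spec back and inspect the clusters section, for example (assuming the pipelines get subcommand is available in your CLI version; jq is used only for readability, and the clusters typically appear under the spec field of the response):

databricks pipelines get --pipeline-id "{{ pipeline_id }}" | jq '.spec.clusters'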

pradeepgv-db avatar Sep 01 '22 22:09 pradeepgv-db