
Should `policy_id` be dropped from the workflow definition? Keeping it might break workflow pause / resume

Open copdips opened this issue 2 years ago • 0 comments

Expected Behavior

When resuming a paused workflow, it should not raise any error.

Current Behavior

The following error is raised:

image

PS: for a workflow that has never been paused, there is no such error.

Steps to Reproduce (for bugs)

  1. In the deployment file, reference a cluster policy, e.g. policy_a, which defines a Spark env var, e.g. env_a=one.
  2. Deploy the workflow.
  3. Pause the workflow.
  4. Update policy_a with a new value for the same env var: env_a=two.
  5. Resume the workflow. The error is raised because the policy_id is referenced in the workflow's final JSON definition, and that definition is validated when the workflow is resumed.

Likewise, once the policy has been updated, a running workflow cannot be paused either, because of the same error.
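
To make the failure mode concrete, here is a toy Python model of the validation described in step 5. It is only a sketch under assumptions, not Databricks' actual implementation: the `violations` helper is hypothetical, the policy ID `policy_a` is a placeholder, and only rules of type `fixed` on `spark_env_vars.<name>` paths (the attribute-path style used in cluster policy definitions) are handled.

```python
def violations(new_cluster: dict, policy_definition: dict) -> list[str]:
    """Return the fixed-value policy rules that the stored cluster spec violates.

    Hypothetical helper: a simplified stand-in for the server-side check.
    """
    problems = []
    env_vars = new_cluster.get("spark_env_vars", {})
    for path, rule in policy_definition.items():
        # Only model "fixed" rules on Spark environment variables.
        if rule.get("type") != "fixed" or not path.startswith("spark_env_vars."):
            continue
        name = path.split(".", 1)[1]
        if env_vars.get(name) != rule["value"]:
            problems.append(
                f"{path}: expected {rule['value']!r}, found {env_vars.get(name)!r}"
            )
    return problems


# Cluster spec as frozen into the workflow JSON at deployment time (steps 1-2).
deployed_cluster = {"policy_id": "policy_a", "spark_env_vars": {"env_a": "one"}}

policy_v1 = {"spark_env_vars.env_a": {"type": "fixed", "value": "one"}}
policy_v2 = {"spark_env_vars.env_a": {"type": "fixed", "value": "two"}}  # step 4

before_update = violations(deployed_cluster, policy_v1)  # no conflict
after_update = violations(deployed_cluster, policy_v2)   # env_a no longer matches
```

In this model the deployed spec still carries env_a=one, so it stops matching the policy as soon as the policy changes, which would explain why any operation that re-validates the definition (pause or resume) fails.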

Context

From my understanding, Databricks cannot enforce the cluster policy; whether it is used depends on the users' willingness. We use it to inherit and validate some settings from the policy at deployment time. So, to bypass the issue, maybe we can drop the `policy_id` key from the workflow definition, so that there won't be any such validation. Otherwise it will be quite difficult to update a cluster policy. Our current workaround is to redeploy the workflow instead of resuming it, because the redeploy uses the new settings.
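
The proposal could look roughly like the following Python sketch. It is an illustration, not dbx's code: `drop_policy_id` is a hypothetical helper, and the layout of the workflow definition (`job_clusters[].new_cluster` and `tasks[].new_cluster`) is assumed to follow the Jobs API 2.1 shape. It also assumes the settings inherited from the policy have already been merged into the cluster spec before this step.

```python
import copy


def drop_policy_id(workflow: dict) -> dict:
    """Return a copy of the workflow definition without any `policy_id` key.

    Hypothetical helper; assumes cluster specs live under
    job_clusters[].new_cluster and tasks[].new_cluster.
    """
    cleaned = copy.deepcopy(workflow)  # leave the caller's definition untouched
    for job_cluster in cleaned.get("job_clusters", []):
        job_cluster.get("new_cluster", {}).pop("policy_id", None)
    for task in cleaned.get("tasks", []):
        task.get("new_cluster", {}).pop("policy_id", None)
    return cleaned


# Example definition; names and values are placeholders.
workflow = {
    "name": "example-workflow",
    "job_clusters": [
        {
            "job_cluster_key": "default",
            "new_cluster": {
                "policy_id": "policy_a",
                "spark_version": "10.4.x-scala2.12",
                "spark_env_vars": {"env_a": "one"},
            },
        }
    ],
    "tasks": [{"task_key": "main", "job_cluster_key": "default"}],
}

cleaned = drop_policy_id(workflow)
```

The settings inherited at deployment time (here `spark_env_vars`) stay in the definition; only the reference to the policy is removed, so a later policy update would no longer be checked against the deployed workflow.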

Your Environment

  • dbx version used: v0.8.7
  • Databricks Runtime version: 10.4

copdips avatar Jan 05 '23 10:01 copdips