dbx
Consider dropping `policy_id` from the workflow definition: it can break pausing and resuming a workflow
Expected Behavior
Resuming a paused workflow should not raise any error.
Current Behavior
The following error is raised:
PS: a workflow that has never been paused does not hit this error.
Steps to Reproduce (for bugs)
- In the deployment file, reference a cluster policy, e.g. `policy_a`. Inside this policy there is a Spark env var, e.g. `env_a=one`.
- Deploy the workflow.
- Pause the workflow.
- Update `policy_a` with a new value for the same env var, `env_a=two`.
- Resume the workflow. The error is raised because `policy_id` is referenced in the final JSON definition of the workflow, and a validation runs when the workflow is resumed.
The same happens in the other direction: once the policy has been updated, the workflow cannot be paused either, and it fails with the same error.
Context
From my understanding, Databricks does not force the use of a cluster policy; whether to use one is up to the user. We use it to inherit and validate some settings from the policy at deployment time.
So, to bypass the issue, maybe we can drop the `policy_id` key from the workflow definition, so that no such validation takes place. Otherwise it will be quite difficult to update a cluster policy.
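To make the proposal concrete, here is a minimal sketch of what "dropping the key" could look like. It is not how dbx works today. It assumes the final workflow definition is available as a plain Python dict and that `policy_id` sits inside `new_cluster` blocks (under `job_clusters` or `tasks`). The function name `drop_policy_id` and the sample values are hypothetical.

```python
import copy


def drop_policy_id(job_definition: dict) -> dict:
    """Return a copy of the definition with `policy_id` removed from every `new_cluster` block."""
    cleaned = copy.deepcopy(job_definition)  # leave the caller's dict untouched

    def strip(node):
        if isinstance(node, dict):
            new_cluster = node.get("new_cluster")
            if isinstance(new_cluster, dict):
                # Settings already inherited from the policy stay in place;
                # only the reference to the policy itself is removed.
                new_cluster.pop("policy_id", None)
            for value in node.values():
                strip(value)
        elif isinstance(node, list):
            for item in node:
                strip(item)

    strip(cleaned)
    return cleaned


# Hypothetical workflow definition, as it might look after the policy was applied.
workflow = {
    "name": "example-workflow",
    "job_clusters": [
        {
            "job_cluster_key": "default",
            "new_cluster": {
                "spark_version": "10.4.x-scala2.12",
                "policy_id": "ABC123",  # made-up id
                "spark_env_vars": {"env_a": "one"},
            },
        }
    ],
    "tasks": [{"task_key": "main", "job_cluster_key": "default"}],
}

final_definition = drop_policy_id(workflow)
```

With this approach the policy would still be used for inheritance and validation at deployment time, but the deployed definition would no longer carry a reference that gets re-validated on pause or resume.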
Our current workaround is to redeploy the workflow instead of resuming it, because the redeploy will use the new settings.
Your Environment
- dbx version used: v0.8.7
- Databricks Runtime version: 10.4