terraform-provider-databricks

[FEATURE] Provide more information regarding misconfiguration in the `plan` command

Open codebydant opened this issue 2 years ago • 0 comments

Hi team,

Greetings.

A plan can succeed while the subsequent apply fails because of a misconfiguration in RBAC, job, or cluster definitions.

Use-cases

  • Cluster parameters misconfiguration

The plan and the apply are successful, but the created resource has some errors.

For example, a cluster can be misconfigured and we don't find out until the resource is created.

Screenshot from 2023-09-08 10-09-38

  • Job cluster ID is not assigned to a task

We try to assign a task in a databricks_job to a nonexistent job cluster.

➜ terraform apply plan.out    
Error: cannot create job: Job cluster 'my-job-cluster' is not defined in field 'job_clusters'.

  with module.my-job-workflow.databricks_job.this,
  on .terraform/modules/my-job-workflow/infrastructure-modules/databricks/main.tf line 5, in resource "databricks_job" "this":
   5: resource "databricks_job" "this" {
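
As a possible stopgap on the module side, a variable validation can catch this at plan time. This is only a sketch: the variable name `job` and its shape are hypothetical and are not taken from the module in the error above.

```hcl
# Hypothetical module input: the job cluster keys and the tasks that use them.
variable "job" {
  type = object({
    job_cluster_keys = list(string)
    tasks = list(object({
      task_key        = string
      job_cluster_key = string
    }))
  })

  validation {
    # Every task must point at a key that is actually declared.
    condition = alltrue([
      for t in var.job.tasks : contains(var.job.job_cluster_keys, t.job_cluster_key)
    ])
    error_message = "Every task must reference a job_cluster_key listed in job_cluster_keys."
  }
}
```

With this in place, `terraform plan` should fail when a task references an undeclared key, instead of the error surfacing during apply.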

  • Combining num_workers with autoscaling min_workers and max_workers

The Terraform state keeps changing if we use both the num_workers parameter and autoscaling. The documentation does not mention that these parameters are mutually exclusive and should not be used at the same time.

https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster#fixed-size-or-autoscaling-cluster
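
Until the provider rejects this combination itself, a module could guard against it with a validation like the one below. It is a minimal sketch: the variable name `cluster_size` is hypothetical, and `optional()` object attributes assume Terraform 1.3 or later.

```hcl
# Hypothetical module input describing how the cluster is sized.
variable "cluster_size" {
  type = object({
    num_workers = optional(number)
    autoscale = optional(object({
      min_workers = number
      max_workers = number
    }))
  })

  validation {
    # At least one of the two must be left unset, so they can never be combined.
    condition     = var.cluster_size.num_workers == null || var.cluster_size.autoscale == null
    error_message = "Set either num_workers or autoscale, not both."
  }
}
```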

  • Duplicated task

The plan is successful, but the apply command fails when a duplicated task is defined in the Terraform file.

Error: cannot create job: Duplicate task keys: task1 found, each task key has to be unique.

  with module.mymodule.databricks_job.this,
  on .terraform/modules/monitoring/infrastructure-modules/databricks/workflows/databricks-job/main.tf line 5, in resource databricks_job this:
   5: resource databricks_job this {
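
A uniqueness check on the task keys is one way a module could surface this during plan. The variable name `task_keys` below is hypothetical.

```hcl
# Hypothetical module input: the task_key of every task in the job.
variable "task_keys" {
  type = list(string)

  validation {
    # distinct() drops repeats, so the lengths differ when a key is duplicated.
    condition     = length(distinct(var.task_keys)) == length(var.task_keys)
    error_message = "Each task key must be unique."
  }
}
```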

  • RBAC misconfiguration: user does not exist

We set the permissions for a user that does not exist in the workspace. The error does not show up when running plan, only when running the apply command.

 ➜ terraform apply plan.out    
module.my-job-workflow.databricks_permissions.job_cluster_permissions[0]: Modifying... [id=/jobs/608751347263651]
╷
│ Error: cannot update permissions: Principal: UserName(MYUSER@MYEMAIL) does not exist
│ 
│   with module.my-job-workflow.databricks_permissions.job_cluster_permissions[0],
│   on databricks-job/main.tf line 128, in resource "databricks_permissions" "job_cluster_permissions":
│  128: resource "databricks_permissions" "job_cluster_permissions" {
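
One possible workaround is to look the user up through the provider's `databricks_user` data source, so that the lookup happens while planning. This sketch assumes the data source returns an error when no matching user is found and that the user name is known at plan time; the variable names are hypothetical.

```hcl
# Hypothetical inputs, not taken from the module in the error above.
variable "job_id" {
  type = string
}

variable "job_manager_user_name" {
  type = string
}

# Read during plan; assumed to fail if the user is not in the workspace.
data "databricks_user" "job_manager" {
  user_name = var.job_manager_user_name
}

resource "databricks_permissions" "job_cluster_permissions" {
  job_id = var.job_id

  access_control {
    # Referencing the data source ties the permission to the verified user.
    user_name        = data.databricks_user.job_manager.user_name
    permission_level = "CAN_MANAGE"
  }
}
```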

  • RBAC misconfiguration: decrease admin privileges

Setting the permissions for an admin on the Databricks job workflow to CAN_MANAGE, CAN_VIEW, etc. causes an error during apply, not when running the plan command.

 ➜ terraform apply plan.out                                             
module.my-job-workflow.databricks_job.this: Creating...
module.my-job-workflow.databricks_job.this: Creation complete after 1s [id=106404155492941]
module.my-job-workflow.databricks_permissions.job_cluster_permissions[0]: Creating...
╷
│ Error: cannot create permissions: it is not possible to decrease administrative permissions for the current user: MYUSER@MYEMAIL
│ 
│   with module.my-job-workflow.databricks_permissions.job_cluster_permissions[0],
│   on databricks-job/main.tf line 128, in resource "databricks_permissions" "job_cluster_permissions":
│  128: resource "databricks_permissions" "job_cluster_permissions" {

This RBAC configuration produces an error only when running the apply command, not when running plan.

Screenshot from 2023-09-09 07-09-25
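
A precondition comparing the listed principals against the identity running Terraform might catch this earlier. The sketch below is stricter than the API's rule, because it rejects any entry for the current user. It assumes that user is a workspace admin, that `terraform_data` is available (Terraform 1.4 or later), and the variable name is hypothetical.

```hcl
# Hypothetical module input: user names that receive explicit job permissions.
variable "access_control_user_names" {
  type = list(string)
}

# The identity Terraform is authenticated as.
data "databricks_current_user" "me" {}

# Guard resource whose only purpose is to carry the precondition.
resource "terraform_data" "permissions_guard" {
  lifecycle {
    precondition {
      condition = !contains(
        var.access_control_user_names,
        data.databricks_current_user.me.user_name
      )
      error_message = "Do not list the current (admin) user in the job permissions."
    }
  }
}
```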

  • Cron expression misconfiguration

Error: cannot create job: Invalid quartz_cron_expression: 58 0 1 0 * * ?. Databricks uses Quartz cron syntax, which is different from the standard cron syntax. See https://docs.microsoft.com/azure/databricks/jobs#schedule-a-job  for more details.

  with module.my_module.databricks_job.this,
  on .terraform/modules/my_module/infrastructure-modules/databricks/workflows/databricks-job/main.tf line 5, in resource databricks_job this:
   5: resource databricks_job this {
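
A structural check on the expression could reject some invalid values during plan. The sketch below only verifies that there are six or seven space-separated fields and that either day-of-month or day-of-week is `?`, as Quartz requires. It does not validate value ranges, and the variable name is hypothetical.

```hcl
# Hypothetical module input holding the schedule.
variable "quartz_cron_expression" {
  type = string

  validation {
    # seconds, minutes, hours, then either "? month day-of-week" or
    # "day-of-month month ?", followed by an optional year field.
    condition = can(regex(
      "^(\\S+ ){3}(\\? \\S+ \\S+|\\S+ \\S+ \\?)( \\S+)?$",
      var.quartz_cron_expression
    ))
    error_message = "Expected a Quartz cron expression with 6 or 7 fields and '?' in day-of-month or day-of-week."
  }
}
```

The expression from the error above, `58 0 1 0 * * ?`, has `?` in neither day field, so this pattern would reject it.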

Attempted Solutions

  • Implement a rollback or dry-run with Terraform

Proposal

Report all failures and misconfigurations during plan execution, and make the plan fail if there is any misconfiguration in RBAC, cluster parameters, etc. That way, if the plan is successful, the apply command should be successful as well.

References

  • https://github.com/hashicorp/terraform/issues/33397

codebydant, Sep 09 '23 12:09