Create MLflow experiment fails when it already exists

Open dgarridoa opened this issue 1 year ago • 3 comments

Describe the issue

Hello. In a Databricks Asset Bundle (DAB), you can declare experiments in the resources.experiments section. According to the documentation, this uses this endpoint, which creates the experiment if it does not exist and throws an error if it already exists. The feature makes little sense if the second deployment fails because the first deployment already created the experiment.

I assume that DAB handles this internally so that no exception is thrown in this scenario, but sometimes the exception still surfaces. An easy way to reproduce it is to have two bundles that reference the same experiment and deploy the first one, then the second one. The second deployment fails because the experiment already exists. This scenario does not bother me too much, but I would prefer logic that creates an experiment only if it does not exist, regardless of which bundle the deployment comes from. However, from time to time, deploying the same bundle also fails with this exception; to get past it quickly, I destroy the bundle and redeploy.
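
To make the behavior I am asking for concrete, here is a minimal sketch of "create if it does not exist" logic. It is only an illustration: `InMemoryExperimentStore`, `ExperimentAlreadyExists`, and `ensure_experiment` are hypothetical names I made up, the store is an in-memory stand-in for the workspace, and none of this is the CLI's or the Terraform provider's actual code.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class ExperimentAlreadyExists(Exception):
    """Hypothetical error mirroring the 'already exists' failure in this report."""


@dataclass
class InMemoryExperimentStore:
    """Hypothetical stand-in for the workspace; not a real Databricks or MLflow API."""

    _by_name: Dict[str, str] = field(default_factory=dict)

    def get_id_by_name(self, name: str) -> Optional[str]:
        # Returns the experiment ID, or None when no experiment has this name.
        return self._by_name.get(name)

    def create(self, name: str) -> str:
        # Strict create: fails on a duplicate name, like the current deployment.
        if name in self._by_name:
            raise ExperimentAlreadyExists(f"Node named '{name}' already exists")
        experiment_id = str(len(self._by_name) + 1)
        self._by_name[name] = experiment_id
        return experiment_id


def ensure_experiment(store: InMemoryExperimentStore, name: str) -> str:
    """Return the ID of the experiment called `name`, creating it only if missing."""
    existing = store.get_id_by_name(name)
    if existing is not None:
        return existing
    try:
        return store.create(name)
    except ExperimentAlreadyExists:
        # Another deployment created it between the lookup and the create call.
        found = store.get_id_by_name(name)
        if found is None:
            raise
        return found


if __name__ == "__main__":
    store = InMemoryExperimentStore()
    first = ensure_experiment(store, "/Shared/.bundle/test-experiment")
    second = ensure_experiment(store, "/Shared/.bundle/test-experiment")
    print(first == second)
```

With this logic, a second deployment that references the same experiment name would reuse the existing experiment instead of failing.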

Configuration

create_experiment/databricks.yml

bundle:
  name: create-experiment

resources:
  experiments:
    mlflow-test-experiment:
      name: /Shared/.bundle/test-experiment
targets:
  dev:
    mode: development
    default: true

create_existing_experiment/databricks.yaml

bundle:
  name: create-existing-experiment

resources:
  experiments:
    mlflow-test-experiment:
      name: /Shared/.bundle/test-experiment
targets:
  dev:
    mode: development
    default: true

Steps to reproduce the behavior

  1. Run databricks bundle deploy in the create_experiment directory.
  2. Run databricks bundle deploy in the create_existing_experiment directory.
  3. See error
Uploading bundle files to /Users/[email protected]/.bundle/create-existing-experiment/dev/files...
Deploying resources...
Updating deployment state...
Error: terraform apply: exit status 1

Error: cannot create mlflow experiment: Node named '[dev diego_garrido_6568] test-experiment' already exists

  with databricks_mlflow_experiment.mlflow-test-experiment,
  on bundle.tf.json line 17, in resource.databricks_mlflow_experiment.mlflow-test-experiment:
  17:       }

Expected Behavior

It should not fail if the experiment already exists; instead, it could display a warning or message.

Actual Behavior

The deployment fails with an error: cannot create MLflow experiment because it already exists.

OS and CLI version

OS: Ubuntu 23.04 x86_64
Databricks CLI: v0.219.0

Debug Logs

log.txt

dgarridoa avatar May 05 '24 20:05 dgarridoa

Uniqueness of a resource is tracked at the bundle level, which is why a resource with the same name is created once per bundle.

If you need to deploy the same resource from a different bundle, you can use databricks bundle deployment bind to bind the resource in DABs to the one already created.

andrewnester avatar May 14 '24 11:05 andrewnester

Uniqueness of a resource is tracked at the bundle level, which is why a resource with the same name is created once per bundle.

If you need to deploy the same resource from a different bundle, you can use databricks bundle deployment bind to bind the resource in DABs to the one already created.

Do you have a reference or doc on how to do that with experiments? I read the docs, and they only mention how to do it with jobs. I tried using the experiment resource key and name, and it says that no such resource exists.

databricks bundle deployment bind [experiment-resource-key] [experiment-id]

dgarridoa avatar May 14 '24 15:05 dgarridoa

@dgarridoa oh, you're right, experiments are not yet supported by the bind command, so we need to implement this. Marking this as a feature request.

andrewnester avatar May 14 '24 17:05 andrewnester

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.

github-actions[bot] avatar Feb 10 '25 00:02 github-actions[bot]