
Using the same workspace directory between different environments

Open WmWessels opened this issue 1 year ago • 0 comments

Expected Behavior

I would like to create two environments in .dbx/project.json that share the same workspace directory but use different artifact locations.

Current Behavior

When I deploy my Python project using dbx in our CI/CD pipeline, I get the following exception:

Exception: Required location of experiment /Shared/dbx/ doesn't match the project defined one.

Steps to Reproduce (for bugs)

Create a dbx project. In the project.json, there should be two different environments. The workspace directory should be the same, but the artifact location should be different.
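The configuration described above can be sketched as follows. This is only an illustration, assuming the `environments` / `properties` layout that recent dbx versions generate for project.json. The environment names `train` and `score`, the `DEFAULT` profile, and the artifact paths are hypothetical. The helper `conflicting_workspaces` is not part of dbx; it only flags workspace directories that are paired with more than one artifact location, which is the setup this issue is about.

```python
import json
from collections import defaultdict

# Hypothetical environment names, profile and artifact paths.
# The workspace directory is the one shown in the exception message.
project = {
    "environments": {
        "train": {
            "profile": "DEFAULT",
            "storage_type": "mlflow",
            "properties": {
                "workspace_directory": "/Shared/dbx/",
                "artifact_location": "dbfs:/dbx/artifacts/train",
            },
        },
        "score": {
            "profile": "DEFAULT",
            "storage_type": "mlflow",
            "properties": {
                "workspace_directory": "/Shared/dbx/",
                "artifact_location": "dbfs:/dbx/artifacts/score",
            },
        },
    }
}


def conflicting_workspaces(project):
    """Return workspace directories configured with more than one artifact location."""
    locations = defaultdict(set)
    for env in project["environments"].values():
        props = env["properties"]
        locations[props["workspace_directory"]].add(props["artifact_location"])
    return {ws: sorted(locs) for ws, locs in locations.items() if len(locs) > 1}


# Print the JSON that would go into .dbx/project.json, then the shared directories.
print(json.dumps(project, indent=2))
print(conflicting_workspaces(project))
```

Running the helper on this dictionary reports `/Shared/dbx/` together with both artifact locations, which matches the combination that fails on deploy.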

Then, create two deployment files (one for training, one for scoring). In the first deployment file, we create a workflow using the first environment. In the second deployment file, we create a workflow using the second environment.

Finally, run:

  • dbx deploy --deployment-file <deployment_file_train>
  • dbx deploy --deployment-file <deployment_file_score>

Context

We want to version our ML code in production. We currently have a training workflow and a scoring workflow: the training workflow stores the trained models, and the scoring workflow refers to these models. We would therefore like both workflows to use the same workspace directory. However, we also want to use different artifact locations, so that we can version our code and the training and scoring workflows do not have to use the same code version.

How would I need to structure my project.json in order to get this to work?

Your Environment

  • dbx version used: 0.8.17
  • Databricks Runtime version: 12.2

WmWessels, Feb 23 '24 08:02