
pydantic error: tutorial https://docs.databricks.com/dev-tools/dbx.html#create-minimal-project for Python fails

[Open] amitca71 opened this issue 2 years ago · 2 comments

Expected Behavior

flow deployed on databricks cluster

Current Behavior

`dbx execute` fails with the following error: `[Errno 2] No such file or directory: '/Users/xxxxx/dbx/dbx-demo/pydantic/main.py'` (the full traceback below ends in a pydantic ValidationError)

```
(dbx-demo) (base) ➜ dbx-demo dbx execute

[dbx][2022-11-30 17:24:36.691] 🔎 Deployment file is not provided, searching in the conf directory
[dbx][2022-11-30 17:24:36.694] 💡 Auto-discovery found deployment file conf/deployment.yaml
[dbx][2022-11-30 17:24:36.695] 🆗 Deployment file conf/deployment.yaml exists and will be used for deployment
Traceback (most recent call last):
  /Users/amitca/.local/share/virtualenvs/dbx-demo-T1WM8LGQ/lib/python3.9/site-packages/dbx/commands/execute.py:71 in execute
    api_client = prepare_environment(environment)
  /Users/amitca/.local/share/virtualenvs/dbx-demo-T1WM8LGQ/lib/python3.9/site-packages/dbx/utils/common.py:29 in prepare_environment
    info = ProjectConfigurationManager().get(env_name)
  /Users/amitca/.local/share/virtualenvs/dbx-demo-T1WM8LGQ/lib/python3.9/site-packages/dbx/api/configure.py:88 in get
    return self._manager.get(environment_name)
  /Users/amitca/.local/share/virtualenvs/dbx-demo-T1WM8LGQ/lib/python3.9/site-packages/dbx/api/configure.py:28 in get
    _typed = self._read_typed()
  /Users/amitca/.local/share/virtualenvs/dbx-demo-T1WM8LGQ/lib/python3.9/site-packages/dbx/api/configure.py:20 in _read_typed
    _typed = ProjectInfo(**_content)
  /Users/amitca/dbx/dbx-demo/pydantic/main.py:342 in pydantic.main.BaseModel.__init__
    [Errno 2] No such file or directory: '/Users/amitca/dbx/dbx-demo/pydantic/main.py'
ValidationError: 4 validation errors for ProjectInfo
environments -> default -> profile
  field required (type=value_error.missing)
environments -> default -> profile
  field required (type=value_error.missing)
environments -> default -> workspace_dir
  field required (type=value_error.missing)
environments -> default -> artifact_location
  field required (type=value_error.missing)
```
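For context, the trailing ValidationError is pydantic's standard response to missing required model fields. The sketch below is plain Python, not the actual dbx `ProjectInfo` model; the required field names are copied from the error output, so treat the schema as an assumption:

```python
# Minimal sketch of the required-field check that pydantic performs when
# dbx builds ProjectInfo from .dbx/project.json. Field names are taken
# from the ValidationError output; the real model may differ by version.
REQUIRED_FIELDS = ("profile", "workspace_dir", "artifact_location")

def validate_environment(env_name: str, env: dict) -> list:
    """Return pydantic-style error lines for missing required fields."""
    return [
        f"environments -> {env_name} -> {field} field required"
        for field in REQUIRED_FIELDS
        if field not in env
    ]

# The environment block that `dbx configure` generated for the reporter:
generated = {
    "storage_type": "mlflow",
    "properties": {
        "workspace_directory": "/Shared/dbx/projects/dbx_databricks",
        "artifact_location": "dbfs:/dbx/dbx_databricks",
    },
}

for line in validate_environment("default", generated):
    print(line)
```

Because none of the reported-required fields exist at the environment level of the generated file, every one of them is flagged, which matches the shape of the ValidationError above.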

Steps to Reproduce (for bugs)

Follow https://docs.databricks.com/dev-tools/dbx.html#create-minimal-project (for Python):

  1. mkdir dbx-demo
  2. cd dbx-demo
  3. copy sample file to dir
  4. export DATABRICKS_HOST=https://xxxxxx.cloud.databricks.com
     export DATABRICKS_TOKEN=xxxxxx
  5. pipenv --python 3.9.7
  6. dbx configure --environment default (no profile as we use env vars)
  7. mkdir conf
  8. Create the file conf/deployment.yaml with the following content:

```yaml
build:
  no_build: true
environments:
  default:
    workflows:
      - name: "dbx-demo-job"
        spark_python_task:
          python_file: "file://dbx-demo-job.py"
```

Context

  1. dbx execute --cluster-id=https://xxxx.cloud.databricks.com/?o=4313615198913373#setting/clusters/xxxx dbx-demo-job --no-package

Your Environment

Python 3.9.7

  • dbx version used: Databricks eXtensions aka dbx, version ~> 0.8.7
  • Databricks Runtime version: 11.3 LTS ML (includes Apache Spark 3.3.0, Scala 2.12)

amitca71 · Nov 30 '22 16:11

You're using an incorrect configuration structure in the .dbx/project.json file. Please take a look at the correct one as shown in the docs: does it match your structure?

[Screenshot 2022-11-30 at 19:27:02: the correct .dbx/project.json structure from the docs]

renardeinside · Nov 30 '22 18:11

I use the one from the tutorial (created automatically by the command line call `dbx configure --profile DEFAULT --environment default`), and it looks similar. I also tried with a profile, with the same results:

```json
{
  "environments": {
    "default": {
      "storage_type": "mlflow",
      "properties": {
        "workspace_directory": "/Shared/dbx/projects/dbx_databricks",
        "artifact_location": "dbfs:/dbx/dbx_databricks"
      }
    }
  }
}
```
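To make the mismatch concrete, here is a hedged sketch that reshapes the generated environment block into the flat layout the ValidationError asks for. The target field names (`profile`, `workspace_dir`, `artifact_location`) come only from the error output, and the `DEFAULT` profile name is an assumption; this illustrates the structural difference rather than a confirmed fix, since a version mismatch between the installed dbx and the file it generated could produce the same symptom.

```python
import json

def flatten_environment(env: dict, profile: str = "DEFAULT") -> dict:
    """Map the generated nested layout onto the flat field names
    reported as required by the ValidationError (assumed schema)."""
    props = env.get("properties", {})
    return {
        "profile": profile,  # hypothetical profile name, not from the source
        "workspace_dir": props.get("workspace_directory"),
        "artifact_location": props.get("artifact_location"),
    }

# The environment block that `dbx configure` generated:
generated = {
    "storage_type": "mlflow",
    "properties": {
        "workspace_directory": "/Shared/dbx/projects/dbx_databricks",
        "artifact_location": "dbfs:/dbx/dbx_databricks",
    },
}

print(json.dumps({"environments": {"default": flatten_environment(generated)}}, indent=2))
```

The point of the sketch is that the generated file keeps `workspace_directory` and `artifact_location` nested under `properties`, while the validator in the traceback looks for top-level per-environment fields.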

amitca71 · Dec 01 '22 06:12