
Misalignment between Databricks CLI and Databricks APIs - importing generic file to workspace

Open luigibrancati opened this issue 2 years ago • 1 comments

We use Databricks on Azure and have a few init scripts that are automatically deployed using Azure releases.

Following this announcement about vulnerabilities in init scripts, we decided to move our init scripts (bash scripts) from the DBFS to the Workspace. While doing so, we encountered a few issues:

  • There's no task on Azure releases to deploy anything but notebooks to the Databricks Workspace
  • The Databricks CLI command databricks workspace import can import only notebooks, since it requires the language option

We worked around this by falling back to curl and calling the Databricks API directly. The API requires only the file content as a base64-encoded string; the language field is optional.

Example code

# script.sh - Example
#!/bin/bash
pip install numpy

# Databricks CLI command - doesn't work
databricks workspace import ./script.sh /InitScripts/script.sh --language PYTHON --profile AZDO --overwrite

# Databricks API - works
# Base64-encode the script on a single line (-w 0 disables line wrapping in GNU base64)
encoded=$(base64 -w 0 ./script.sh)
curl --location '<workspace>/api/2.0/workspace/import' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <token>' \
--data '{"format": "AUTO", "path": "/InitScripts/script.sh", "content": "'"$encoded"'", "overwrite": "true"}'
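
For anyone who needs the same workaround in a release pipeline, the curl call above could be wrapped roughly like this. This is only a sketch: the function names build_import_payload and import_workspace_file are hypothetical, reading the workspace URL and token from DATABRICKS_HOST and DATABRICKS_TOKEN is an assumption about how the pipeline passes credentials, and the target path is assumed to contain no characters that need JSON escaping.

```shell
#!/bin/bash
# Hypothetical helpers; names and environment variables are illustrative assumptions.

# Print the JSON body for POST /api/2.0/workspace/import.
# Assumes the target path contains no characters that need JSON escaping.
build_import_payload() {
  local file="$1" target="$2" encoded
  # Strip newlines so the result is a single line with both GNU and BSD base64.
  encoded=$(base64 < "$file" | tr -d '\n')
  printf '{"format": "AUTO", "path": "%s", "content": "%s", "overwrite": "true"}' \
    "$target" "$encoded"
}

# Upload a local file to the workspace; --fail makes curl exit non-zero on HTTP errors.
import_workspace_file() {
  local file="$1" target="$2"
  curl --fail --silent --show-error --location \
    "${DATABRICKS_HOST}/api/2.0/workspace/import" \
    --header 'Content-Type: application/json' \
    --header "Authorization: Bearer ${DATABRICKS_TOKEN}" \
    --data "$(build_import_payload "$file" "$target")"
}
```

With both variables exported, a release step could then call import_workspace_file ./script.sh /InitScripts/script.sh. Building the payload in a separate function makes it possible to inspect the request body without contacting the workspace.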

luigibrancati, Jun 07 '23 15:06

I think I can open a PR if this isn't intentional.

luigibrancati, Jun 07 '23 15:06