Service Principal for bundle validate src error
Describe the issue
I want to use a Service Principal I configured for my workspace to do the actual validation and deployment from a serverless web terminal.
Configuration
Please provide a minimal reproducible configuration for the issue
Steps to reproduce the behavior
Please list the steps required to reproduce the issue, for example:
- Create a simple bundle containing only a job (under resources) with one task which calls a hello world notebook (under src)
- Open web terminal and switch to existing bundle folder of current logged in user
- Create the `~/.databrickscfg` file with profile `dab_sp` to store the Azure service principal creds.
- Set `export DATABRICKS_TOKEN=` to force usage of creds from the cfg file.
- Run `databricks bundle validate -t dev -p dab_sp`
- See error
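For reference, the profile from the steps above might look like this (a sketch: the `[dab_sp]` section name comes from the steps, while the Azure field names and all values are assumptions based on the Databricks unified client authentication conventions):

```ini
; ~/.databrickscfg -- sketch; all IDs and secrets are placeholders
[dab_sp]
host                = https://adb-xxxxx.1.azuredatabricks.net
azure_tenant_id     = <tenant-id>
azure_client_id     = <sp-application-id>
azure_client_secret = <sp-client-secret>
```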
Expected Behavior
I would expect the call to use the given service principal for validation and deployment.
Actual Behavior
Getting this error message:

```
Error: notebook src/hello_notebook.ipynb not found

Name: dab_test
Target: dev
Workspace:
  Host: https://adb-xxxxx.1.azuredatabricks.net
  User: xxxx-xxxx-xxx-xxxx-xxxxx
  Path: /Workspace/Users/xxxx/.bundle/gen_dab_test/dev
```
OS and CLI version
Databricks CLI v0.270.0
Is this a regression?
Did this work in a previous version of the CLI? If so, which versions did you try?
Debug Logs
see attachment
Thanks for reporting the issue.
Can you confirm whether the notebook's export format is .ipynb or .py?
If you refer to `src/hello_notebook.ipynb`, then the export format needs to be `.ipynb`, even if the file extension is not shown in the workspace UI.
The format is `.ipynb`; it is not displayed in the UI but can be checked in the web terminal.
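The check mentioned above can be scripted: a Jupyter-format notebook file is a JSON document, while a notebook exported in Databricks source format starts with a `# Databricks notebook source` marker comment. This is an illustrative helper (the function name is mine, not part of the CLI):

```python
import json


def notebook_format(path):
    """Guess a notebook file's export format by inspecting its content.

    Returns "JUPYTER" for .ipynb-style JSON, "SOURCE" for the Databricks
    source format, and "UNKNOWN" otherwise.
    """
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        data = json.loads(text)
        # Jupyter notebooks are JSON objects with a top-level "cells" key.
        if isinstance(data, dict) and "cells" in data:
            return "JUPYTER"
    except ValueError:
        pass
    # Source-format exports begin with a well-known marker comment.
    if text.startswith("# Databricks notebook source"):
        return "SOURCE"
    return "UNKNOWN"
```

Running it against `src/hello_notebook.ipynb` in the web terminal would show which format the bundle is actually referencing.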
BTW: using my current user instead of the service principal results in successful validation.
I suspect the problem is that the notebook files are present in the workspace folder of the user you log in with in the UI, not of the SP, because the bundle was never deployed for this SP.
@fjakobs @ilyakuz-db do you know more about it?
@andrewnester: this is why I set the workspace root folder in my databricks.yml to the current user's folder. Btw: why does it then find the resources folder and read the job.yml file?
@bicaluv can you share your databricks.yml configuration here? Likely you have an include section which picks up other configuration files.
As to the original issue, does the SP you use have permissions to access your current user folder?
Where and how can I check this? Using `ls -l` does not show any specific limits.
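One way to inspect folder access is via the workspace Permissions API rather than the filesystem view (a sketch; the folder path and object id are placeholders, and the SP would need to appear in the ACL with at least read access):

```shell
# Look up the workspace object id of the folder the bundle reads from
databricks workspace get-status /Workspace/Users/<you>/<bundle-folder>

# Query its access control list, using the object_id from the previous call
databricks api get /api/2.0/permissions/directories/<object-id>
```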
databricks.yml:

```yaml
# This is a Databricks asset bundle definition for datamind-allspark.
# The Databricks extension requires a databricks.yml configuration file.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
  name: gen_dab_test

include:
  - resources/*.yml

workspace:
  root_path: /Workspace/Users/[email protected]/.bundle/${bundle.name}/${bundle.target}

targets:
  dev:
    default: true
    workspace:
      host: https://adb-123456.azuredatabricks.net
    run_as:
      service_principal_name: cbdb0852-xxx-1234-8616-xxxxx
    variables:
      cluster_policy_id:
        default: 30648123456
      pause_status:
        default: PAUSED
      warehouse_id:
        default: 76e354d1234567
    presets:
      tags:
        use-case: bicaluv_tests
    permissions:
      - level: CAN_MANAGE
        group_name: Data-Engineers
      - level: CAN_MANAGE
        service_principal_name: cbdb0852-xxx-43e2-xxx-efb05512345
      - level: CAN_MANAGE
        user_name: [email protected]

variables:
  cluster_policy_id:
    default: 306483B21234567
    description: The cluster policy ID, depending on the stage we are working in.
  warehouse_id:
    default: 0
    description: The SQL warehouse ID to use for interacting with materialized views.
  pause_status:
    default: UNPAUSED
    description: Should the job be paused or unpaused?
```
And the job YAML referenced under resources:

```yaml
resources:
  jobs:
    test_job:
      name: test_job
      tasks:
        - task_key: notebook_runner
          email_notifications:
            on_failure:
              - [email protected]
          notebook_task:
            notebook_path: ../src/hello_notebook.ipynb
            source: WORKSPACE
          notification_settings: {}
          run_if: ALL_SUCCESS
          webhook_notifications: {}
      email_notifications: {}
      max_concurrent_runs: 1
      performance_target: PERFORMANCE_OPTIMIZED
      queue:
        enabled: true
      tags:
        "contact": "[email protected]"
      webhook_notifications: {}
```