[ISSUE] How to troubleshoot when job creation fails

Open huydinhle opened this issue 1 year ago • 1 comments

Description How can we find out what error Databricks returns when creating a job with the Python SDK fails?

Reproduction A minimal code sample demonstrating the bug.

import os
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

notebook_path = f'/Users/{w.current_user.me().user_name}/sdk-{time.time_ns()}'

# ensure_cluster_is_running() returns None, so read the cluster ID separately.
cluster_id = os.environ["DATABRICKS_CLUSTER_ID"]
w.clusters.ensure_cluster_is_running(cluster_id)

created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[
                                jobs.Task(description="test",
                                          existing_cluster_id=cluster_id,
                                          notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
                                          task_key="test",
                                          timeout_seconds=0)
                            ])

# cleanup
w.jobs.delete(job_id=created_job.job_id)

Jun 06 '24 19:06 huydinhle

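Have you tried setting the databricks.sdk logger to DEBUG (or adding your own handler to it), so that the SDK logs the requests it sends and the responses it gets back?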

import logging
import sys

from databricks.sdk import WorkspaceClient

# Root logger stays at INFO; only the SDK's own logger is raised to DEBUG.
logging.basicConfig(stream=sys.stderr,
                    level=logging.INFO,
                    format='%(asctime)s [%(name)s][%(levelname)s] %(message)s')
logging.getLogger('databricks.sdk').setLevel(logging.DEBUG)

# Truncate logged payloads to 1024 bytes and keep headers out of the log.
w = WorkspaceClient(debug_truncate_bytes=1024, debug_headers=False)
for cluster in w.clusters.list():
    logging.info(f'Found cluster: {cluster.cluster_name}')

https://databricks-sdk-py.readthedocs.io/en/latest/logging.html
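
If you would rather capture the debug output for a single failing call instead of turning it on globally, something along these lines might help. It is only a sketch using the standard logging module: the helper name call_with_sdk_debug_log is made up for illustration and is not part of the SDK. It also assumes the SDK logs under the databricks.sdk logger name, as in the snippet above.

```python
import io
import logging


def call_with_sdk_debug_log(fn, logger_name="databricks.sdk"):
    """Run fn() while capturing DEBUG log records emitted under logger_name.

    Hypothetical helper, not an SDK API. Returns (result, error, log_text);
    error is None when fn() succeeds.
    """
    logger = logging.getLogger(logger_name)
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setLevel(logging.DEBUG)
    handler.setFormatter(logging.Formatter("%(name)s [%(levelname)s] %(message)s"))

    previous_level = logger.level
    logger.setLevel(logging.DEBUG)
    logger.addHandler(handler)
    result, error = None, None
    try:
        result = fn()
    except Exception as exc:  # broad on purpose: the goal is to report it
        error = exc
    finally:
        # Restore the logger so other code keeps its original verbosity.
        logger.removeHandler(handler)
        logger.setLevel(previous_level)
    return result, error, buffer.getvalue()
```

With the client from the question, the call could be wrapped as call_with_sdk_debug_log(lambda: w.jobs.create(...)). If error is not None, printing repr(error) together with log_text should show both the exception the SDK raised and whatever it logged for that call.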

Jun 17 '24 08:06 detemegandy