databricks-sdk-py

[ISSUE] databricks sdk: depends_on=['task1'] raises AttributeError: 'str' object has no attribute 'as_dict'

Open shivatharun opened this issue 1 year ago • 6 comments

Below is the code. It breaks at depends_on=['task1'], which is meant to make task2 depend on the completion of task1.

Error: AttributeError: 'str' object has no attribute 'as_dict'. Please tell me the correct way to declare a dependency between tasks.

import os
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

notebook_path = f'/Users/{w.current_user.me().user_name}/sdk-{time.time_ns()}'

cluster_id = w.clusters.ensure_cluster_is_running(
    os.environ["DATABRICKS_CLUSTER_ID"]) and os.environ["DATABRICKS_CLUSTER_ID"]

created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[
                                jobs.Task(description="test",
                                          existing_cluster_id=cluster_id,
                                          notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
                                          task_key="task1",
                                          timeout_seconds=0),
                                jobs.Task(description="test",
                                          existing_cluster_id=cluster_id,
                                          notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
                                          task_key="task2",
                                          depends_on=['task1'],  # this line raises the AttributeError
                                          timeout_seconds=0)
                            ])


shivatharun avatar Jan 11 '24 17:01 shivatharun

Hi @shivatharun. The depends_on field of jobs.Task expects a list of TaskDependency objects (Docs), not a list of strings:

jobs.TaskDependency(task_key="task1")

Can you give that a try?

import os
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import jobs

w = WorkspaceClient()

notebook_path = f"/Users/{w.current_user.me().user_name}/sdk-{time.time_ns()}"

# waiting for the cluster to start
w.clusters.ensure_cluster_is_running(os.environ["DATABRICKS_CLUSTER_ID"])

cluster_id = os.environ["DATABRICKS_CLUSTER_ID"]

first_task = jobs.Task(
    description="test",
    existing_cluster_id=cluster_id,
    notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
    task_key="task1",
    timeout_seconds=0,
)

second_task = jobs.Task(
    description="test",
    existing_cluster_id=cluster_id,
    notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
    task_key="task2",
    depends_on=[jobs.TaskDependency(task_key="task1")],
    timeout_seconds=0,
)

created_job = w.jobs.create(
    name=f"sdk-{time.time_ns()}", tasks=[first_task, second_task])
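
To show why a plain string fails here, below is a minimal sketch that does not use the SDK at all. The two classes are simplified stand-ins written for illustration (they are not the SDK's own classes); the assumption is that the real jobs.Task serializes depends_on by calling as_dict() on each element, which is what the reported error message suggests.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TaskDependency:
    """Simplified stand-in for jobs.TaskDependency (illustration only)."""
    task_key: str

    def as_dict(self) -> dict:
        return {"task_key": self.task_key}


@dataclass
class Task:
    """Simplified stand-in for jobs.Task, keeping only two fields."""
    task_key: str
    depends_on: Optional[List[TaskDependency]] = None

    def as_dict(self) -> dict:
        body = {"task_key": self.task_key}
        if self.depends_on:
            # Every element is asked to serialize itself, so each one
            # must be an object that exposes as_dict().
            body["depends_on"] = [dep.as_dict() for dep in self.depends_on]
        return body


# A list of dependency objects serializes cleanly.
ok = Task("task2", depends_on=[TaskDependency("task1")]).as_dict()
print(ok)

# A list of plain strings fails, because str has no as_dict() method.
try:
    Task("task2", depends_on=["task1"]).as_dict()
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

The failure only appears at serialization time, which is why the error surfaces inside w.jobs.create rather than when the Task is constructed.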

kimberlyma avatar Jan 13 '24 00:01 kimberlyma

Hi @shivatharun, thanks for reaching out. Can you please tell if the solution proposed by @kimberlyma is working?

tanmay-db avatar Jan 15 '24 14:01 tanmay-db

Hi @tanmay-db @mgyucht, actually there is no TaskDependency class in my installation, even after I upgraded the Databricks SDK. It references TaskDependenciesItem instead. However, if I pass
jobs.TaskDependenciesItem(task_key="task1"), it raises AttributeError: module 'databricks.sdk.service.jobs' has no attribute 'TaskDependenciesItem'

shivatharun avatar Jan 17 '24 13:01 shivatharun

Hi @shivatharun - to clarify, did you upgrade the SDK in your development environment, and are you getting the error in that same environment? Can you verify your SDK version with pip show databricks-sdk? Running pip install databricks-sdk --upgrade will get you the latest. The class has been named TaskDependency rather than TaskDependenciesItem since the release for v.0.1.12 #205
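
If it helps, the same check can be done from inside the Python interpreter that actually runs the job code, which rules out a mismatch between the environment where pip ran and the one where the script runs. This is a rough sketch using only the standard library; installed_version and module_has are helper names made up for this example.

```python
import importlib
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def module_has(module_name: str, attr: str) -> bool:
    """Return True if the module can be imported and exposes the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)


print("databricks-sdk version:", installed_version("databricks-sdk"))
for name in ("TaskDependency", "TaskDependenciesItem"):
    print(name, "available:", module_has("databricks.sdk.service.jobs", name))
```

If both names print False, the SDK is probably not importable from that interpreter at all, which points to an environment problem rather than a naming one.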

kimberlyma avatar Jan 18 '24 23:01 kimberlyma

Thanks for the solution provided. It worked smoothly for me!

sreekanth9999 avatar May 10 '24 20:05 sreekanth9999

Thanks for the solution, it works for a single dependency. How do I specify dependencies on multiple tasks, and how do I set the "Run if dependencies" option?

dkipnis avatar May 29 '24 02:05 dkipnis