databricks-sdk-py
[ISSUE] Databricks SDK jobs: how to create dependent tasks / lineage using Python
How can I create dependent jobs / lineage using the Databricks SDK? I only found documentation for creating a single job:
created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[
                                jobs.Task(description="test",
                                          existing_cluster_id=cluster_id,
                                          notebook_task=jobs.NotebookTask(notebook_path="test_run"),
                                          task_key="test",
                                          timeout_seconds=0)
                            ])
Let's say I have a main notebook, and within that notebook I create a job "test" that triggers the "test_run" notebook. I want to run the test_run notebook with different parameters. How do I create lineage using the Python SDK? Could you please share any references? I couldn't find any.
Hi @shivatharun, lineage isn't currently supported in the SDK. However, you could update the job with different parameters, for example: https://github.com/databricks/databricks-sdk-py/blob/main/examples/jobs/update_jobs_api_full_integration.py, where you could use a different JobSettings. Does this work for your use case?
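For the specific case of running the same notebook with different parameters, one run after another, a rough sketch of the task list may help. It is written as plain dicts in the shape of the Jobs API 2.1 `tasks` payload so it runs without a workspace. The helper `notebook_task`, the task keys `test_a` / `test_b`, the `run_date` parameter, and the cluster ID placeholder are all made up for illustration. This only covers ordering between tasks inside one job, not data lineage.

```python
import json


def notebook_task(task_key, notebook_path, cluster_id, params, upstream=()):
    """Build one entry for the `tasks` list (hypothetical helper, not part of the SDK)."""
    task = {
        "task_key": task_key,
        "existing_cluster_id": cluster_id,
        "notebook_task": {
            "notebook_path": notebook_path,
            # Parameters passed to the notebook for this task only.
            "base_parameters": dict(params),
        },
    }
    if upstream:
        # Each dependency names the task_key that must finish before this task starts.
        task["depends_on"] = [{"task_key": key} for key in upstream]
    return task


CLUSTER_ID = "<your-cluster-id>"  # placeholder, replace with a real cluster ID

tasks = [
    # Same notebook twice, with different (made-up) parameters.
    notebook_task("test_a", "test_run", CLUSTER_ID, {"run_date": "2024-01-01"}),
    notebook_task("test_b", "test_run", CLUSTER_ID, {"run_date": "2024-01-02"},
                  upstream=["test_a"]),
]

print(json.dumps(tasks, indent=2))
```

In the SDK, these fields should correspond to jobs.Task(task_key=..., depends_on=[jobs.TaskDependency(task_key="test_a")], notebook_task=jobs.NotebookTask(notebook_path=..., base_parameters={...})), passed in the tasks list of w.jobs.create.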
Hi @tanmay-db, may I know how tasks can run in parallel within a job, without any dependencies? Is there a limit on the number of tasks?
created_job = w.jobs.create(name=f'sdk-{time.time_ns()}',
                            tasks=[task1, task2, task3, ........])
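As far as I know, tasks in the same job that declare no `depends_on` are eligible to start concurrently, so a list like the one above would already run in parallel. A minimal sketch of building such a list, again as plain dicts in the Jobs API 2.1 `tasks` shape: the helper `independent_tasks`, the `region` parameter, and the cluster ID placeholder are hypothetical. The per-job task limit is not shown here and is best checked in the Databricks resource limits documentation.

```python
def independent_tasks(notebook_path, cluster_id, param_sets):
    """Build one task per parameter set, with no depends_on (hypothetical helper)."""
    return [
        {
            # task_key must be unique within the job.
            "task_key": f"test_{i}",
            "existing_cluster_id": cluster_id,
            "notebook_task": {
                "notebook_path": notebook_path,
                "base_parameters": dict(params),
            },
        }
        for i, params in enumerate(param_sets)
    ]


# Made-up parameter sets: one notebook run per region.
param_sets = [{"region": "eu"}, {"region": "us"}, {"region": "apac"}]
tasks = independent_tasks("test_run", "<your-cluster-id>", param_sets)

task_keys = [t["task_key"] for t in tasks]
print(task_keys)
```

Since all tasks here share one existing cluster, how many actually run at the same time would also depend on that cluster's capacity.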