
[ISSUE] Unable to use JobCluster in job submitted via client.jobs.submit

Open jmeidam opened this issue 1 year ago • 2 comments

Description

Perhaps I am missing something in my implementation, but with client.jobs.submit the only way I can see to define a task's cluster is to pass new_cluster to SubmitTask. However, when I do that, a new cluster is created for every single task.

I have noticed that I can first create a persistent job from a CreateJob definition with client.jobs.create. That does allow defining a single job cluster with a key that all tasks can reference (sketched below), but I do not see a way to do the same for an ephemeral, one-time run. I would expect the tasks passed to client.jobs.submit to also be able to refer to a job-cluster key defined somewhere.
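For reference, a rough sketch of that create-based workaround (names such as the "shared" job-cluster key and the notebook path are placeholders): define the cluster once under job_clusters, point every task at it via job_cluster_key, run the job, and delete it afterwards to keep it quasi-ephemeral.

import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

client = WorkspaceClient()

cluster_spec = compute.ClusterSpec(
    node_type_id="Standard_DS3_v2",
    num_workers=1,
    spark_version="14.3.x-scala2.12",
)

created = client.jobs.create(
    name=f"py-sdk-job-{time.time()}",
    # One cluster definition, referenced by key from every task
    job_clusters=[jobs.JobCluster(job_cluster_key="shared", new_cluster=cluster_spec)],
    tasks=[
        jobs.Task(
            task_key="tester1",
            job_cluster_key="shared",
            notebook_task=jobs.NotebookTask(notebook_path="some_path"),
        ),
        jobs.Task(
            task_key="tester2",
            job_cluster_key="shared",
            notebook_task=jobs.NotebookTask(notebook_path="some_path"),
        ),
    ],
)

# Run the job once and clean it up so it behaves like a one-time submit
client.jobs.run_now(job_id=created.job_id).result()
client.jobs.delete(job_id=created.job_id)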

So I am not sure whether this is a bug in the SDK or a missing feature in the Databricks API.

Reproduction

import time
from databricks.sdk.service import compute, jobs
from databricks.sdk import WorkspaceClient

client = WorkspaceClient()
notebook_path = "some_path"

new_cluster = compute.ClusterSpec(
    driver_node_type_id="Standard_DS3_v2",
    node_type_id="Standard_DS3_v2",
    num_workers=1,
    spark_version="14.3.x-scala2.12",
)

tasks = [
    jobs.SubmitTask(
        task_key="tester1",
        new_cluster=new_cluster,
        notebook_task=jobs.NotebookTask(
            notebook_path=notebook_path
        )
    ),
    jobs.SubmitTask(
        task_key="tester2",
        new_cluster=new_cluster,
        notebook_task=jobs.NotebookTask(
            notebook_path=notebook_path
        )
    )
]

client.jobs.submit(
    run_name=f'py-sdk-run-{time.time()}',
    tasks=tasks
)

This creates two clusters, "tester1 cluster" and "tester2 cluster", one per task.

Expected behavior

I expect one cluster that is used by all tasks, unless I explicitly specify otherwise in one of the tasks.

Is it a regression?

Debug Logs

Other Information

  • Version: 0.29.0

Additional context

jmeidam · Sep 03 '24 10:09

I recently went through the same pain. Unfortunately, even though this endpoint is more flexible, it lacks a lot of features compared to the one that creates an actual workflow.

I completely dropped the idea to use the "submit" endpoint.

The other one, which creates a job, works like a charm. Even though I think rate limiting is a bit stricter for that one, the SDK has ways to work around that and retry requests.
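(A minimal sketch of the retry knob I mean, assuming retry_timeout_seconds is what you want to tune; it controls how long the SDK keeps retrying rate-limited requests, and 900 is just an illustrative value.)

from databricks.sdk import WorkspaceClient
from databricks.sdk.core import Config

# The SDK retries rate-limited (429) requests automatically; a longer retry
# window gives jobs.create more room when the rate limit is stricter.
client = WorkspaceClient(config=Config(retry_timeout_seconds=900))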

P.S.: I personally don't think there will ever be feature parity between the two endpoints.

calin-hb · Sep 18 '24 07:09

Hi @jmeidam, thanks for reporting the issue. The best way to resolve API-related questions is to contact your Databricks POC, who can connect you with the API owner to resolve this kind of issue.

parthban-db · Jan 31 '25 17:01