databricks-sdk-py
[ISSUE] Unable to use JobCluster in job submitted via client.jobs.submit
Description
Perhaps I am missing something in my implementation, but when using client.jobs.submit I can only see a way to create a job's cluster by passing new_cluster to SubmitTask. However, when I do that, a new cluster is created for every single task.
I have noticed that I can create a persistent job first using a CreateJob definition with client.jobs.create. This does allow for creating one job-cluster with a key that can be referenced by all tasks. However, I do not see a way to create an ephemeral job in such a way. I would expect the tasks in the client.jobs.submit command to also be able to refer to a job-cluster key that is created somewhere.
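For reference, a minimal sketch of the persistent-job workaround described above, written as the raw Jobs 2.1 `create` payload that `client.jobs.create` (and the `jobs.JobCluster` / `jobs.Task` dataclasses) map onto. The names `shared`, `ephemeral-style-job`, and the notebook path are placeholders, not values from the issue:

```python
# One cluster is declared under "job_clusters" with a key, and every task
# points at it via "job_cluster_key" instead of carrying its own "new_cluster".
cluster_spec = {
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 1,
    "spark_version": "14.3.x-scala2.12",
}

create_payload = {
    "name": "ephemeral-style-job",  # placeholder job name
    "job_clusters": [
        {"job_cluster_key": "shared", "new_cluster": cluster_spec}
    ],
    "tasks": [
        {
            "task_key": "tester1",
            "job_cluster_key": "shared",  # reuses the shared job cluster
            "notebook_task": {"notebook_path": "some_path"},
        },
        {
            "task_key": "tester2",
            "job_cluster_key": "shared",  # same cluster; no second one is created
            "notebook_task": {"notebook_path": "some_path"},
        },
    ],
}

# Both tasks reference the single declared job cluster.
keys = {t["job_cluster_key"] for t in create_payload["tasks"]}
print(keys == {create_payload["job_clusters"][0]["job_cluster_key"]})  # True
```

In the SDK this is the `client.jobs.create(..., job_clusters=[jobs.JobCluster(...)], tasks=[jobs.Task(..., job_cluster_key="shared")])` path; as the report says, the `submit` path offers no equivalent hook in this version.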
So I am not sure if this is a bug. It may be a missing feature in the Databricks API.
Reproduction
import time

from databricks.sdk import WorkspaceClient
from databricks.sdk.service import compute, jobs

client = WorkspaceClient()

notebook_path = "some_path"

new_cluster = compute.ClusterSpec(
    driver_node_type_id="Standard_DS3_v2",
    node_type_id="Standard_DS3_v2",
    num_workers=1,
    spark_version="14.3.x-scala2.12",
)

tasks = [
    jobs.SubmitTask(
        task_key="tester1",
        new_cluster=new_cluster,
        notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
    ),
    jobs.SubmitTask(
        task_key="tester2",
        new_cluster=new_cluster,
        notebook_task=jobs.NotebookTask(notebook_path=notebook_path),
    ),
]

client.jobs.submit(
    run_name=f"py-sdk-run-{time.time()}",
    tasks=tasks,
)
This will create two clusters: "tester1 cluster" and "tester2 cluster".
Expected behavior
I expect one cluster that is used by all tasks, unless I explicitly specify otherwise in one of the tasks.
Is it a regression?
Debug Logs
Other Information
- Version: 0.29.0
Additional context
I recently went through the same pain. Unfortunately, even though this endpoint is more flexible, it lacks a lot of features compared to the one that creates an actual workflow.
I completely dropped the idea of using the "submit" endpoint.
The other one, which creates a job, works like a charm, and even though I think rate limiting is a bit stricter for it, the SDK has ways to work around that and retry requests.
P.S.: I personally don't think there will ever be feature parity between the two endpoints.
Hi @jmeidam, thanks for reporting the issue. The best way to resolve API-related queries is to contact your Databricks POC, who can help with this kind of issue by connecting you with the API owner.