databricks-sdk-py

[FEATURE] Jobs.create method should also accept a JobDefinition

Open mniehoff opened this issue 1 year ago • 6 comments

Problem Statement The jobs.create() method only accepts the individual settings as parameters. This is especially annoying when the job settings are stored as JSON/YAML or a dict, because each one has to be mapped to a parameter.

Proposed Solution The jobs.create() method should accept a job definition object (JobSettings in the example below), just as the jobs.reset() method does.

job_settings = JobSettings.from_dict(job_settings_dict)
w.jobs.create(job_settings)

Additional Context Other methods, not only those for Jobs, would likely benefit from an easy way to send JSON/YAML/dicts to the API without mapping them to method parameters.

mniehoff avatar Nov 07 '23 16:11 mniehoff

Hi @mniehoff, thanks for reporting this. For context, the API of the SDK is meant to match the REST API nearly exactly, which you can see here: create a job, reset a job. However, to make this use case easier, our plan is to change the SDK to support plain objects/strings in place of dataclasses/enums. Then, you should be able to do the following:

w.jobs.create(**job_settings_dict)

using kwargs from your settings dictionary for each of the parameters of the create API.

mgyucht avatar Nov 20 '23 10:11 mgyucht

That would be great, and I guess also more Pythonic than dataclasses :-)

mniehoff avatar Nov 28 '23 14:11 mniehoff

This would be a great feature indeed. A useful intermediate solution would be for JobSettings' as_dict method to accept a flag that makes it serialize shallowly rather than completely, just far enough that the resulting dictionary can be unpacked into valid input for jobs.create(). Something like:

job_settings = JobSettings.from_dict(job_settings_dict)
w.jobs.create(**job_settings.as_dict(serialize_shallow=True))
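To make "shallow" concrete, here is a minimal sketch built on stand-in dataclasses rather than the SDK's own classes. `Task`, `Settings`, and `shallow_dict` are hypothetical names used only for illustration, and `serialize_shallow` above is a proposed flag, not something the SDK is known to provide. The idea is to convert only the top level to a dict and leave nested dataclasses as objects.

```python
import dataclasses
from dataclasses import dataclass
from typing import List, Optional


# Stand-ins for the SDK's dataclasses (hypothetical, for illustration only).
@dataclass
class Task:
    task_key: str


@dataclass
class Settings:
    name: Optional[str] = None
    tasks: Optional[List[Task]] = None
    timeout_seconds: Optional[int] = None


def shallow_dict(obj) -> dict:
    """Convert only the top level of a dataclass instance to a dict.

    Nested dataclasses are kept as objects; unset (None) fields are dropped.
    """
    result = {}
    for field in dataclasses.fields(obj):
        value = getattr(obj, field.name)
        if value is not None:
            result[field.name] = value
    return result


settings = Settings(name="nightly", tasks=[Task(task_key="ingest")])

shallow = shallow_dict(settings)     # nested Task objects are preserved
deep = dataclasses.asdict(settings)  # nested Task objects become plain dicts
```

The contrast with `dataclasses.asdict` is the point: a deep conversion turns nested objects into dicts, which a method expecting dataclass arguments may not accept, while the shallow version keeps them as they are.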

camilo-s avatar Mar 11 '24 15:03 camilo-s

Is there any update or alternative for this? Currently it stops us from adopting the SDK in place of the deprecated CLI, because we store job and cluster config in YAML/JSON files and would need to map each attribute when using the create methods. With the CLI we could just do jobs_api.create_job(json=job_definition).

mniehoff avatar Apr 23 '24 08:04 mniehoff

Here's a dirty workaround:

job_settings = JobSettings.from_dict(job_settings_dict)
w.jobs.create(**job_settings.__dict__)
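The workaround relies on `__dict__` holding the top-level dataclass fields with nested objects left intact. One caveat is that if the object carries a field the target method does not accept, the call fails with a TypeError. Below is a hedged sketch of a guard against that. It uses a stand-in `create` function and a stand-in `Settings` class instead of the real SDK, and `kwargs_for` is a hypothetical helper, not part of any library.

```python
import inspect
from dataclasses import dataclass
from typing import Optional


# Stand-in for a settings dataclass (hypothetical, for illustration only).
@dataclass
class Settings:
    name: Optional[str] = None
    max_concurrent_runs: Optional[int] = None
    internal_note: Optional[str] = None  # deliberately not accepted by create()


# Stand-in for a keyword-only API method such as a "create" call.
def create(*, name=None, max_concurrent_runs=None):
    return {"name": name, "max_concurrent_runs": max_concurrent_runs}


def kwargs_for(func, obj) -> dict:
    """Keep only the set attributes of obj that func accepts as parameters."""
    accepted = inspect.signature(func).parameters
    return {
        key: value
        for key, value in vars(obj).items()
        if key in accepted and value is not None
    }


settings = Settings(name="nightly", internal_note="draft")

# create(**settings.__dict__) would raise TypeError because of internal_note.
created = create(**kwargs_for(create, settings))
```

Filtering against the function signature trades a loud failure for a silent drop of unknown fields, so it may be worth logging whatever gets filtered out.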

camilo-s avatar Apr 29 '24 17:04 camilo-s

Is there any update on this?

scdmitry avatar May 02 '24 12:05 scdmitry