databricks-cli
This is more of a question: can databricks-cli be used to run a notebook?
I have a Python script that pulls a ton of data into S3, and I created a notebook in Databricks to turn it into a Parquet file. I was wondering if I can use the CLI to run a job after I finish pulling in all my data? Thanks
Hi Otis,
The easiest way to do this is to click on the Jobs tab (https://docs.databricks.com/user-guide/jobs.html#create-a-job) in Databricks and define a job that runs the notebook. Once you have the job ID, you can then run it using the jobs feature (https://docs.databricks.com/user-guide/dev-tools/databricks-cli.html#job-cli) of the Databricks CLI:
databricks jobs run-now --job-id 55
Cheers, Doug
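Since the data pull in the question is already a Python script, the command above could be called from that script once the upload finishes. The following is only a minimal sketch: the helper names `build_run_now_command` and `trigger_job` are hypothetical, and it assumes the CLI is installed, already configured, and prints its response as JSON (typically including a `run_id`).

```python
import json
import subprocess


def build_run_now_command(job_id):
    """Return the argv list for the `run-now` command shown above."""
    return ["databricks", "jobs", "run-now", "--job-id", str(job_id)]


def trigger_job(job_id, runner=subprocess.run):
    """Start the job and return the parsed JSON that the CLI prints.

    `runner` is injectable so the function can be exercised without a
    workspace; by default it shells out to the real CLI.
    Assumption: the CLI writes a JSON document to stdout.
    """
    completed = runner(
        build_run_now_command(job_id),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)
```

At the end of the data-pull script, something like `response = trigger_job(55)` would start the job; `55` is the example job ID from the reply above, not a real one.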
On Fri, May 11, 2018 at 11:40 AM, alienpepsiman [email protected] wrote:
I have a Python script that pulls a ton of data into S3, and I created a notebook in Databricks to turn it into a Parquet file. I was wondering if I can use the CLI to run a job after I finish pulling in all my data? Thanks
--
Doug Bateman
Director of Training and Education
Databricks Inc.
925.444.5893 <+19254445893>
databricks.com
Suppose --job-id 55 runs a .py file that contains print statements. How can I display their output in my console when I run the command <databricks jobs run-now --job-id 55>? Some sort of 'follow' mode.
It would be awesome if the CLI had a synchronous method for executing notebooks. Right now, we have to poll the job to see when it's done. Even if the CLI returned only the notebook exit value at the end, it would be very useful when orchestrating Databricks operations.
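Until something synchronous exists, the polling described here can be wrapped in a small helper. This is a rough sketch, not an official recipe: `get_run` and `wait_for_run` are hypothetical names, it assumes `databricks runs get --run-id <id>` prints the run as JSON, and it relies on the `state.life_cycle_state` field that the Jobs API reports for a run.

```python
import json
import subprocess
import time

# Life-cycle states after which a run will not change again (Jobs API).
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def get_run(run_id):
    """Fetch run metadata via the CLI; assumes the command prints JSON."""
    completed = subprocess.run(
        ["databricks", "runs", "get", "--run-id", str(run_id)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(completed.stdout)


def wait_for_run(run_id, fetch=get_run, poll_seconds=30, sleep=time.sleep):
    """Poll until the run reaches a terminal state, then return its `state` dict.

    `fetch` and `sleep` are injectable so the loop can be tested offline.
    """
    while True:
        state = fetch(run_id).get("state", {})
        if state.get("life_cycle_state") in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
```

An orchestration script could then check `wait_for_run(run_id).get("result_state")` and stop if it is anything other than `"SUCCESS"`. This only reports the final status; it does not stream print output while the run is in progress.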
Might be fixed by #455