aztk
AZTK powered by Azure Batch: On-demand, Dockerized, Spark Jobs on Azure
The Spark Job UI only runs on the node where the driver is executed. If that isn't the master node, then when you ssh in and browse to localhost:4040, you get...
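As a workaround until this is fixed, the UI can be reached by forwarding port 4040 from whichever node runs the driver. A minimal sketch that just builds the ssh command (the username and host are placeholders, not aztk's actual connection details):

```python
def spark_ui_tunnel_cmd(user, driver_host, ssh_port=22, ui_port=4040):
    """Build an ssh command that forwards the driver's Spark UI port
    to localhost, so http://localhost:4040 works on the local machine."""
    return [
        "ssh",
        "-p", str(ssh_port),
        "-L", f"{ui_port}:localhost:{ui_port}",
        f"{user}@{driver_host}",
    ]

# Placeholder host; run the returned command to open the tunnel.
cmd = spark_ui_tunnel_cmd("spark", "driver-node.example.com")
```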
Currently, job submission mode doesn't have the ability to specify file shares: https://github.com/Azure/aztk/blob/17755e0ffadfca78908c9a4b4c0e7a6e78c2dda3/aztk/spark/client.py#L170
For plugins that run in the Spark container, some client-side validation of toolkit (docker-image) and plugin compatibility would prevent a lot of confusion.
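A sketch of what that client-side check could look like. The plugin names and the compatibility table below are purely illustrative, not aztk's real plugin metadata:

```python
# Hypothetical compatibility table: which docker images each plugin supports.
PLUGIN_REQUIREMENTS = {
    "jupyter": {"base", "anaconda"},
    "rstudio_server": {"r"},
}

def validate_plugins(docker_image, plugins):
    """Return human-readable errors for plugins that are not compatible
    with the chosen toolkit (docker image); empty list means all OK."""
    errors = []
    for plugin in plugins:
        supported = PLUGIN_REQUIREMENTS.get(plugin)
        if supported is not None and docker_image not in supported:
            errors.append(
                f"plugin '{plugin}' is not supported on image "
                f"'{docker_image}' (supported: {sorted(supported)})"
            )
    return errors
```

Running this before cluster creation would surface the mismatch at submit time instead of as a silent failure inside the Spark container.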
By supplying the correct JDBC driver, we should be able to read/write data from/to MS SQL Server: https://blogs.msdn.microsoft.com/bigdatasupport/2015/10/22/how-to-allow-spark-to-access-microsoft-sql-server/
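On the Spark side this would go through the standard JDBC data source. The sketch below only builds the option dict (host, database, and credentials are placeholders; the driver jar itself still has to be on the cluster, e.g. via `--jars`):

```python
def mssql_jdbc_options(host, database, table, user, password, port=1433):
    """Options for spark.read.format("jdbc") against MS SQL Server,
    using the standard mssql-jdbc driver class."""
    return {
        "url": f"jdbc:sqlserver://{host}:{port};databaseName={database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }

# With a SparkSession (not created here):
# df = spark.read.format("jdbc").options(**mssql_jdbc_options(...)).load()
```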
Many errors that can occur while deploying a new cluster are currently silently ignored. We should make sure the node gets into `StartTaskFailed` state if any of...
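One way to get that behavior: run every setup step through a wrapper that returns a non-zero exit code on the first failure, since Azure Batch moves a node to `StartTaskFailed` when its start task exits non-zero. A sketch, with illustrative step names:

```python
import sys

def run_setup(steps):
    """Run named setup callables in order. On the first exception, report
    it and return a non-zero exit code so the start task fails loudly
    instead of silently continuing."""
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            print(f"setup step '{name}' failed: {exc}", file=sys.stderr)
            return 1
    return 0

# In the real start task script (step callables are hypothetical):
# sys.exit(run_setup([("docker pull", pull_image), ("start spark", start_spark)]))
```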
After submitting a job, I often find myself juggling several commands to get all the info about it (`get`, `list-apps`, `get-app`, `get-app-logs`), so I put together one command that gets...
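The aggregation itself is straightforward. In the sketch below the four fetchers stand in for whatever `get`, `list-apps`, `get-app`, and `get-app-logs` actually call in the SDK; only the command names come from the CLI:

```python
def job_summary(job_id, get, list_apps, get_app, get_app_logs):
    """Combine the output of the four per-job commands into one report:
    job metadata plus, for each application, its info and its logs."""
    return {
        "job": get(job_id),
        "applications": [
            {"info": get_app(job_id, app), "logs": get_app_logs(job_id, app)}
            for app in list_apps(job_id)
        ],
    }
```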
Since #534, any node can be ssh'ed into. We should allow specifying the node_id in the ssh command. `aztk spark cluster ssh --id {cluster-id} --node-id {node-id}` By default, we still...
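A minimal sketch of the flag parsing, assuming an argparse-style CLI (aztk's actual CLI wiring may differ); leaving `--node-id` unset keeps the current default of connecting to the master node:

```python
import argparse

parser = argparse.ArgumentParser(prog="aztk spark cluster ssh")
parser.add_argument("--id", dest="cluster_id", required=True,
                    help="cluster to connect to")
parser.add_argument("--node-id", dest="node_id", default=None,
                    help="node to ssh into; defaults to the master node")

# Example invocation with a placeholder node id:
args = parser.parse_args(["--id", "my-cluster", "--node-id", "tvm-123"])
```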