Cook
Fair job scheduler on Kubernetes and Mesos for batch workloads and Spark
Is there a reason Cook assumes that each end user has an accompanying POSIX user that is used on the slaves? For me this requirement renders Cook unusable.
@tnachen pointed me to https://github.com/apache/spark/blob/3c0156899dc1ec1f7dfe6d7c8af47fa6dc7d00bf/core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala. We could implement something similar for Cook to simplify submitting production jobs to Spark via Cook.
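For context, Cook already accepts jobs over REST. A minimal sketch of what a submission payload looks like, assuming Cook's documented `/rawscheduler` endpoint and job fields (`uuid`, `command`, `cpus`, `mem`, `max_retries`); check the API docs for your deployed version before relying on these names:

```python
import json
import uuid

def build_cook_job(command, cpus, mem_mb, name="sparkjob", max_retries=3):
    """Build one job map for Cook's /rawscheduler endpoint.

    Field names follow Cook's REST job schema as documented upstream;
    treat them as assumptions and verify against your Cook version.
    """
    return {
        "uuid": str(uuid.uuid4()),   # client-generated job id
        "name": name,
        "command": command,
        "cpus": cpus,
        "mem": mem_mb,               # Cook expects memory in MB
        "max_retries": max_retries,
    }

# The request body wraps one or more jobs in a "jobs" list.
payload = json.dumps(
    {"jobs": [build_cook_job("spark-submit --class Foo app.jar", 2.0, 4096)]}
)
# POST payload to http://<cook-host>:12321/rawscheduler with your
# cluster's auth (e.g. Kerberos), for instance via urllib.request.
```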
Besides setting the CPUs and memory for each executor, we should be able to specify additional URIs or environment variables to retrieve for the executor, and the min threshold of...
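As a rough sketch, most of this could ride on standard Spark configuration properties, which the Cook binding would pass through to each executor; any Cook-specific property (e.g. a minimum threshold) would be new and is not shown here:

```properties
# Standard Spark properties the Cook binding could expose per executor
spark.executor.memory        4g
spark.executor.cores         2
# Extra URIs fetched into the executor's working directory
spark.files                  hdfs:///deps/config.yaml
spark.jars                   hdfs:///deps/extra.jar
# Per-executor environment variables
spark.executorEnv.LOG_LEVEL  INFO
```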
We should have a table documenting all the properties that you can configure for the Cook spark binding.
We need to test against kerberized Hadoop to confirm that this isn't needed; it was added as a Kerberos support hack, but its time may have passed.
I would like to be able to convey to users how many resources they can expect to use at a given time. While the "share" concept (https://github.com/twosigma/Cook/blob/master/scheduler/src/cook/mesos/share.clj)...
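One way to frame the expectation: under full contention, a user's slice of each resource is their share weighted against the sum of all active users' shares. This is an illustrative sketch only, not Cook's actual share/preemption algorithm, and the function name is hypothetical:

```python
def expected_resources(shares, capacity, user):
    """Estimate a user's expected slice of each resource when the
    cluster is fully contended: capacity scaled by the user's share
    relative to the sum of all users' shares. Illustrative only."""
    totals = {r: sum(s[r] for s in shares.values()) for r in capacity}
    return {r: capacity[r] * shares[user][r] / totals[r] for r in capacity}

shares = {"alice": {"cpus": 10, "mem": 20000},
          "bob":   {"cpus": 30, "mem": 60000}}
capacity = {"cpus": 400, "mem": 800000}  # mem in MB

# alice holds 10/40 of the cpu shares and 20000/80000 of the mem shares,
# so she can expect roughly 100 cpus and 200000 MB under contention.
expected_resources(shares, capacity, "alice")
```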
This will give us an idea of how long it should take to start some number of jobs, of various sizes. The motivation is to understand how long it _should_...
This is a discussion of the new Cook scheduler feature called a "pool". Motivation: we want to support scheduling jobs on heterogeneous clusters (e.g. some preemptible machines, some non-preemptible...
Currently, we have logic that keeps jobs without GPUs off GPU-enabled hosts, so they don't tie up the host's non-GPU resources. It would be good to make that...
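The current rule amounts to a symmetric host filter: GPU jobs only land on GPU hosts, and non-GPU jobs stay off them. A minimal sketch of that rule, not Cook's actual Fenzo constraint API (the function and dict shapes are hypothetical):

```python
def host_allowed(job, host):
    """Placement rule sketch: a job may run on a host only when the
    job's GPU demand and the host's GPU capability match. Making this
    configurable would mean relaxing the second half of the rule."""
    job_wants_gpus = job.get("gpus", 0) > 0
    host_has_gpus = host.get("gpus", 0) > 0
    return job_wants_gpus == host_has_gpus

# A GPU job matches a GPU host; a plain job is kept off GPU hosts.
host_allowed({"gpus": 1}, {"gpus": 4})   # GPU job, GPU host -> allowed
host_allowed({"cpus": 2}, {"gpus": 4})   # non-GPU job, GPU host -> rejected
```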