Cook
Design: Support for Dynamic Executor Allocation in Spark Scheduler
The retry logic in the Coarse Grained Scheduler should support Dynamic Executor Allocation in Spark. That is, if spark.dynamicAllocation.enabled is set, Cook will scale the number of executors according to:
- spark.dynamicAllocation.initialExecutors
- spark.dynamicAllocation.minExecutors
- spark.dynamicAllocation.maxExecutors
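For reference, a minimal sketch of the relevant spark-defaults.conf settings (values are illustrative; the scheduler starts at initialExecutors and keeps the executor count between minExecutors and maxExecutors — note that in Spark 1.6 dynamic allocation also requires the external shuffle service):

```properties
# Illustrative values only.
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.initialExecutors=2
spark.dynamicAllocation.minExecutors=1
spark.dynamicAllocation.maxExecutors=20
# Required so executors can be removed without losing shuffle data.
spark.shuffle.service.enabled=true
```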
References:
- https://spark.apache.org/docs/1.6.0/job-scheduling.html#dynamic-resource-allocation
- http://jerryshao.me/architecture/2015/08/22/spark-dynamic-allocation-investigation/
Questions: How does this interact with 'spark.executor.failures' and the failure count logic in CoarseCookSchedulerBackend? It is not obvious how preemption will interact with this feature with respect to user experience. We should test with the external shuffle service enabled and use the Cook simulator to better understand this.