How to Handle Scheduler-Specific Details
In the responses to both #3 and #4, it seems there are a number of useful options that are not explicitly handled by DRMAA. How should we handle this? Some thoughts:
- We pass through options so that people who understand their underlying system aren't throttled by what we provide natively
- We might consider making scheduler-specific subclasses and CLIs, like dask-sge alongside dask-drmaa
That seems pretty reasonable to me, provided it is possible to subclass the subclasses to add site-specific options (or there is some other reasonable way to do that).
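To make the layering concrete, here is a rough sketch of how pass-through options, a scheduler-specific subclass, and a site-specific subclass could stack. All class names, keyword arguments, and the "short.q" queue are hypothetical and are not part of dask-drmaa. The only things borrowed from real systems are DRMAA's nativeSpecification job template attribute and the SGE qsub flags -q and -l h_vmem.

```python
class DRMAAClusterSketch:
    """Hypothetical stand-in for a DRMAA-backed cluster class."""

    def __init__(self, native_specification=""):
        # Option 1: pass raw scheduler flags straight through, untouched.
        self.native_specification = native_specification

    def job_template(self):
        # A plain dict standing in for a DRMAA job template.
        return {
            "remoteCommand": "dask-worker",
            "nativeSpecification": self.native_specification,
        }


class SGEClusterSketch(DRMAAClusterSketch):
    """Option 2: a scheduler-specific subclass with friendlier keywords."""

    def __init__(self, queue=None, memory=None, native_specification=""):
        parts = []
        if queue:
            parts += ["-q", queue]
        if memory:
            parts += ["-l", f"h_vmem={memory}"]
        if native_specification:
            # Raw flags are still accepted and appended last.
            parts.append(native_specification)
        super().__init__(native_specification=" ".join(parts))


class MySiteSGECluster(SGEClusterSketch):
    """A site-specific subclass that only fills in local defaults."""

    def __init__(self, **kwargs):
        kwargs.setdefault("queue", "short.q")  # assumed site default
        super().__init__(**kwargs)


cluster = MySiteSGECluster(memory="2G", native_specification="-P myproject")
print(cluster.job_template()["nativeSpecification"])
```

In this arrangement the raw string stays available at every level, so someone who knows their scheduler is never blocked by the keywords a subclass happens to expose.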
Both solutions at the same time are good. Every site (and site within a site) has a different set of resources and ways to use them. Some places penalize small jobs (i.e. jobs with a small number of CPUs) or restrict them to running in only a few places, and some give them priority. The users of those systems will know how to game the system, so dask should make it easy for them. Making scheduler-specific classes and CLIs also makes sense, as not all schedulers have a compatible DRMAA interface. Further, some schedulers have interfaces for taking info from external sources (like dask) to optimize how they schedule things.
Summary: Both 1 and 2 are good, depending on who is using it.
Hope this makes sense.