Modify workerTemplate for particular submissions
Currently we create a single job template to create all workers.
class DRMAACluster(object):
def __init__(self, **kwargs):
self.local_cluster = LocalCluster(n_workers=0, **kwargs)
self.session = drmaa.Session()
self.session.initialize()
self.worker_template = self.session.createJobTemplate()
self.worker_template.remoteCommand = os.path.join(sys.exec_prefix, 'bin', 'dask-worker')
self.worker_template.jobName = 'dask-worker'
self.worker_template.args = ['%s:%d' % (socket.gethostname(), self.local_cluster.scheduler.port)]
self.worker_template.outputPath = ':/%s/out' % os.getcwd()
self.worker_template.errorPath = ':/%s/err' % os.getcwd()
self.worker_template.workingDirectory = os.getcwd()
self.workers = set()
This job template is easily accessible and so provides a nice and familiar release valve for expert users of python-drmaa who want to customize their setup.
However as we submit slightly different kinds of worker jobs we'll want to modify this template a bit, by specifying extra native specifications and by adding extra keywords to the dask-worker command.
def start_workers(self, n=1, .. want to add extra arguments here ...):
and safely create new worker template here for submission
I'm unable to find a way to copy-and-append an existing job template, which leads me to instead consider storing all of the job template attributes (args, native spec, command, error path, etc.) bare, instead of within a job template object. Then we'll create a new job template for each call to start_workers.
class DRMAACluster(object):
def __init__(self, **kwargs):
self.local_cluster = LocalCluster(n_workers=0, **kwargs)
self.session = drmaa.Session()
self.session.initialize()
# self.worker_template = self.session.createJobTemplate()
self.remoteCommand = os.path.join(sys.exec_prefix, 'bin', 'dask-worker')
self.jobName = 'dask-worker'
self.args = ['%s:%d' % (socket.gethostname(), self.local_cluster.scheduler.port)]
self.outputPath = ':/%s/out' % os.getcwd()
self.errorPath = ':/%s/err' % os.getcwd()
self.workingDirectory = os.getcwd()
self.workers = set()
def start_workers(self, args, ...)
wt = self.session.createJobTemplate()
wt.args = self.args + args
...
Does anyone see a better way?
cc @davidr
@mrocklin I think that keeping the job template attributes inside python makes more sense since the python drmaa library does a lot of C-drmaa wrapping. If for some reason, the drmaa session dies and has to be reinitialized, the drmaa job templates might be lost (speculation).
Yeah, in #10 we now generate a new job template every time we launch new workers.