batchtools
Any plans to support dependencies between jobs?
Hi, I'm interested in using batchtools, but after looking at the documentation I'm not sure whether it supports dependencies between jobs, which is a key feature that I would need. Slurm documents this at https://slurm.schedmd.com/job_array.html
e.g.
# Wait for entire job array to complete successfully
sbatch --depend=afterok:123 my.job
If batchtools does support dependencies, where are the docs?
If not, how hard would it be to implement?
hello @berndbischl @arfon @timflutre @mllg
Well, @mllg really should answer here, but my 2 cents:
a) No, this is not supported. Maybe you can hack something in, but it isn't supported in a cool and general way.
b) It was one of the first big general issues I opened for BatchJobs quite some time ago. This is something that would really take bt to the next level, IMHO.
But stuff like that is usually not that simple to implement.
If some of us are here, can we maybe at least specify what we want before we jump to a solution? What would a cool system for this look like?
As @berndbischl said, it is not yet supported. A simple version would not be too hard to implement; it all depends on the interface you need. The following would be relatively easy to write:
- You define jobs as usual with batchMap().
- Get the table of all jobs you want to submit, e.g. ids = findNotSubmitted().
- Add an integer column depends.on, which is either NA (no dependency) or a valid job id, and pass the table to submitJobs().
- submitJobs() first submits all jobs where depends.on is NA. It then waits until all of these jobs have been submitted to Slurm, because the Slurm job id returned by sbatch has to be in the database.
- Adjust the resources to add "depend=afterok:xx" and submit all jobs whose dependencies have already been submitted. Repeat until all jobs are submitted.
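The loop described in the list can be sketched in base R without touching a real scheduler. This is only an illustration under stated assumptions: fake_sbatch() and submit_with_deps() are hypothetical names, not batchtools functions, and the scheduler ids are made up by a counter.

```r
# Hypothetical stand-in for the scheduler call: returns a made-up Slurm job id.
# In a real setup this would be the id reported by sbatch.
fake_sbatch <- local({
  counter <- 1000L
  function(job.id, dependency = NULL) {
    counter <<- counter + 1L
    counter
  }
})

# jobs: data.frame with integer columns job.id and depends.on (NA = no dependency).
# submit: function(job.id, dependency) returning the scheduler's id for that job.
submit_with_deps <- function(jobs, submit = fake_sbatch) {
  jobs$batch.id <- NA_integer_
  jobs$dependency <- NA_character_
  repeat {
    pending <- is.na(jobs$batch.id)
    if (!any(pending)) break
    # Scheduler id of each job's parent; NA if there is no parent or the
    # parent has not been submitted yet.
    parent.batch <- jobs$batch.id[match(jobs$depends.on, jobs$job.id)]
    ready <- pending & (is.na(jobs$depends.on) | !is.na(parent.batch))
    if (!any(ready)) stop("unresolvable dependencies (cycle or unknown job id)")
    for (i in which(ready)) {
      dep <- if (is.na(jobs$depends.on[i])) NULL else paste0("afterok:", parent.batch[i])
      jobs$batch.id[i] <- submit(jobs$job.id[i], dependency = dep)
      if (!is.null(dep)) jobs$dependency[i] <- dep
    }
  }
  jobs
}

# Job 1 has no dependency, jobs 2 and 3 wait for job 1, job 4 waits for job 3.
jobs <- data.frame(job.id = 1:4, depends.on = c(NA, 1L, 1L, 3L))
res <- submit_with_deps(jobs)
print(res[, c("job.id", "depends.on", "batch.id", "dependency")])
```

Each pass submits every job whose parent already has a scheduler id, so a chain of depth n needs n passes. The single depends.on column limits each job to one parent.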
Would it make sense at some point (as this would be more complicated, I guess) to look at a combo with drake?
Hi @mllg, thanks for the idea to use depend=afterok:xx in resources. In fact, I could probably do this in the current version of batchtools, as long as I use one registry per step, right?
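library(batchtools)
library(data.table)

# One registry per step; Step1 and Step2 are the user's own functions.
reg1 = makeRegistry("~/registry/1")
reg2 = makeRegistry("~/registry/2")
batchMap(fun = Step1, 1:10, reg = reg1)
batchMap(fun = Step2, "FOO", reg = reg2)

# Put all step-1 jobs into one chunk and submit it as a single array job.
jobs <- getJobTable(reg = reg1)
chunks <- data.table(jobs, chunk = 1)
submitJobs(chunks, resources = list(
  walltime = 3600, memory = 1024, ncpus = 1, ntasks = 1,
  chunks.as.arrayjobs = TRUE),
  reg = reg1)

# Strip the "_<index>" suffix from the array batch id to get the Slurm job id.
# reg = reg1 is needed here because the default registry is now reg2.
jobs.done <- getJobTable(reg = reg1)
job.id <- sub("_.*", "", jobs.done$batch.id)[[1]]

# Step 2 starts only after the whole step-1 array has finished successfully.
submitJobs(resources = list(
  walltime = 3600, memory = 1024, ncpus = 1, ntasks = 1,
  afterok = job.id
), reg = reg2)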
I added the following line to slurm-simple.tmpl:
<%= if (!is.null(resources$afterok)) paste0("#SBATCH --depend=afterok:", resources$afterok) %>
Do you think that is an OK approach?
For me it seems a bit cumbersome to have to create one registry per step...
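To make the template line above easier to check, here is a small base R sketch of the same expression as a function. dependency_directive() is a hypothetical helper, not part of batchtools. The handling of several ids is an addition that relies on Slurm accepting multiple job ids separated by ":" after the dependency type.

```r
# Mirrors the R expression inside the template line: produce the #SBATCH
# directive only when resources$afterok is set, otherwise produce nothing.
dependency_directive <- function(resources) {
  if (is.null(resources$afterok)) return(character(0))
  # Several ids are joined with ":" (afterok:123:124).
  paste0("#SBATCH --depend=afterok:", paste(resources$afterok, collapse = ":"))
}

with_dep <- dependency_directive(list(walltime = 3600, afterok = "123"))
without_dep <- dependency_directive(list(walltime = 3600))
multi_dep <- dependency_directive(list(afterok = c("123", "124")))
print(with_dep)
```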
Wouldn't it make more sense to explicitly not try this in batchtools and use a workflow tool for job dependencies, like drake?
> Wouldn't it make more sense to explicitly not try this in batchtools and use a workflow tool for job dependencies, like drake?
For more complex scenarios and to ensure portability between batch systems: yes. But as outlined above, it is not that difficult to implement. You just need a topo-sort, e.g. from https://github.com/mlr-org/mlr3misc/blob/master/R/topo_sort.R.
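For illustration, a simple layer-by-layer topological sort in base R could look like the sketch below. It is not the mlr3misc implementation linked above. topo_order() and the job names are hypothetical, and the input format (a named list of parent names) is an assumption.

```r
# deps: named list; each element holds the names of the jobs that job depends on.
# Returns the job names in an order where every job comes after its parents.
topo_order <- function(deps) {
  nodes <- names(deps)
  unknown <- setdiff(unlist(deps, use.names = FALSE), nodes)
  if (length(unknown) > 0L) stop("unknown job(s): ", paste(unknown, collapse = ", "))
  remaining <- deps
  ordered <- character(0)
  while (length(remaining) > 0L) {
    # A job is ready once all of its parents are already in the ordering.
    is.ready <- vapply(remaining, function(p) all(p %in% ordered), logical(1))
    ready <- names(remaining)[is.ready]
    if (length(ready) == 0L) stop("dependency cycle detected")
    ordered <- c(ordered, ready)
    remaining <- remaining[setdiff(names(remaining), ready)]
  }
  ordered
}

deps <- list(
  prepare = character(0),
  fit_a   = "prepare",
  fit_b   = "prepare",
  report  = c("fit_a", "fit_b")
)
submit_order <- topo_order(deps)
print(submit_order)
```

Submitting jobs in this order guarantees that the scheduler id of every parent is known by the time a dependent job needs it for its afterok option.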