Grouping tasks to run on the same VM instance
I am submitting task lists with more than 1K elements, where each individual task takes 1-2 minutes to run. That works fine, but it can be fairly inefficient: VM orchestration, pulling container images from a private repo, etc. all consume a large fraction of time and bandwidth relative to the run time of an individual task.
I wanted to know if there is a way to ask dsub to keep reusing the same VM / image and execute groups of 10 or more tasks before releasing the VM. SGE has a task-grouping feature for large array jobs of short-running tasks. Could we do something similar with dsub, please?
Example requested usage:
# --tasks: very big array job with 1-2 minute run times per task
# --task-grouping: reuse the same VM for 20 tasks (proposed flag)
# --max-concurrency: do not run more than 100 VMs at once (proposed flag)
dsub \
  ... \
  --tasks /tmp/dsub/tmp3prgdww1_job/tasks.tsv \
  --task-grouping 20 \
  --max-concurrency 100 \
  --preemptible 3 \
  --retries 3 \
  --min-ram xxx \
  --min-cores xxx \
  --timeout 1h \
  --wait
I am in a similar situation!
Would be very nice...
@RiverShah, @gsneha26, @slagelwa: Maybe this is too obvious, but why not combine each group of 20 files into a single self-indexable input/output file, and let your script loop through it, selecting the correct index as needed? A sketch of one way to do this follows.
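A minimal sketch of that idea, using a per-group manifest of GCS paths rather than one concatenated file (same effect: one dsub task per group of 20). All names here (all_inputs.txt, gs://my-bucket, process_one) are placeholders, and the task script assumes gsutil is available in the container image, since dsub only localizes the manifest file itself.

Packing, run locally:

GROUP_SIZE=20
# all_inputs.txt: one gs:// URI per line; split it into groups of 20
split -l "${GROUP_SIZE}" -d all_inputs.txt group_

# Build tasks.tsv: the header row names the input column,
# then one manifest per task row
printf -- '--input MANIFEST\n' > tasks.tsv
for m in group_*; do
  gsutil cp "${m}" "gs://my-bucket/manifests/${m}"
  printf 'gs://my-bucket/manifests/%s\n' "${m}" >> tasks.tsv
done

Task script (passed via --script; each VM now processes a whole group):

#!/bin/bash
set -euo pipefail
# dsub localizes the manifest and exposes its local path as $MANIFEST
while read -r uri; do
  gsutil cp "${uri}" ./work_item    # fetch one item of the group
  process_one ./work_item           # placeholder per-item command
done < "${MANIFEST}"

Then submit as usual, e.g. dsub ... --tasks tasks.tsv --script process_group.sh. With ~1K inputs you get ~50 rows, so ~50 VM spin-ups each doing 20 items, instead of 1,000.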
Hope it helps, ~p