
SLURM/HPC executor

Open multimeric opened this issue 2 years ago • 3 comments

From what I'm reading in the docs, the three executors are AWS Batch, AWS Glue, and local. However, for HPC users it would be helpful to have a dedicated executor that submits tasks to the cluster's queueing system (e.g. SLURM). A slightly easier way to do this in Python might be to write a Dask executor: since Dask has implementations for many platforms (e.g. http://jobqueue.dask.org/en/latest/), you more or less get HPC support for free.
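To make the request concrete, here is a rough sketch of the submission step such an executor would need. It is not redun code and does not use redun's executor API. The names `SlurmResources`, `build_sbatch_command`, `parse_job_id`, and `submit` are hypothetical, and it assumes `sbatch` is on the `PATH` of the machine running the scheduler.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SlurmResources:
    """Hypothetical per-task resource request (field names are illustrative)."""
    cpus: int = 1
    memory_gb: int = 4
    time_limit: str = "01:00:00"
    partition: Optional[str] = None


def build_sbatch_command(job_name: str, command: List[str],
                         resources: SlurmResources) -> List[str]:
    """Translate a task's command line into an sbatch invocation."""
    args = [
        "sbatch",
        "--parsable",  # print only the job id (optionally "id;cluster")
        f"--job-name={job_name}",
        f"--cpus-per-task={resources.cpus}",
        f"--mem={resources.memory_gb}G",
        f"--time={resources.time_limit}",
    ]
    if resources.partition:
        args.append(f"--partition={resources.partition}")
    # --wrap lets sbatch run a plain command without a separate script file.
    args.append(f"--wrap={shlex.join(command)}")
    return args


def parse_job_id(stdout: str) -> str:
    """Extract the job id from `sbatch --parsable` output."""
    return stdout.strip().split(";")[0]


def submit(job_name: str, command: List[str],
           resources: SlurmResources) -> str:
    """Submit the command to SLURM and return the job id (needs a real cluster)."""
    result = subprocess.run(
        build_sbatch_command(job_name, command, resources),
        check=True, capture_output=True, text=True,
    )
    return parse_job_id(result.stdout)
```

A real executor would also have to poll job state (for example with `squeue` or `sacct`) and move task inputs and outputs through a shared filesystem. Those are the parts a Dask-based approach could take over.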

multimeric avatar Apr 01 '22 04:04 multimeric

Thanks @multimeric for posting this issue. You are correct that we intend to add additional executors over time, and HPC clusters are indeed an important use case. Piggybacking on Dask to get multiple executor backends at once is a great idea to investigate. Thanks for sharing!

mattrasmus avatar Apr 01 '22 15:04 mattrasmus

Hi @mattrasmus, I would be interested in trying out redun as an alternative to Snakemake, but according to the documentation the only viable way to use redun at scale is by running it on AWS. This is a big no-go.

Are there any updates on a SLURM / HPC executor? Is it possible to configure custom executors, as Snakemake allows?

Hoeze avatar May 11 '23 17:05 Hoeze

+1. I'd also recommend considering the use of PSI/J to streamline such an addition: https://github.com/ExaWorks/psij-python. It is a lightweight dependency with a unified interface to various job schedulers, including up-and-coming ones.

Andrew-S-Rosen avatar Sep 05 '23 05:09 Andrew-S-Rosen