
miniwdl ext for SLURM

rhpvorderman opened this issue 2 years ago • 13 comments

Hi @mlin,

I see there are some projects in the miniwdl-ext repository.

Is there a clearly defined API for miniwdl extensions? I would love to write a miniwdl extension for SLURM. I cannot emphasize this enough: I would love to do this.

Cromwell does support SLURM, but adding that support hasn't been easy and there are still some lingering issues. Also, getting PRs merged into Cromwell takes months, as the developer team focuses on other things. I don't blame them for this, but it is rather inconvenient for us. Ping @DavyCats @illusional

Having us at LUMC write a miniwdl extension for SLURM has the following benefits:

  • Separate from MiniWDL codebase, so no responsibility for you.
  • Code maintained by people who know SLURM quite well and care about the HPC world.

I think this would be a win-win for miniwdl and the HPC community, as it brings miniwdl to the HPC audience.

Can you give me some pointers to the API? I cannot find it in the docs. I will be able to figure something out myself by dissecting the Amazon plugins of course.

rhpvorderman avatar Jun 24 '22 10:06 rhpvorderman

I'd be super keen to help out building this. Would love one for SGE too :D

illusional avatar Jun 24 '22 12:06 illusional

Hi @rhpvorderman @DavyCats @illusional

Likewise, I'd be delighted to collaborate with you on this, even though I'm short of bandwidth & expertise to lead it personally. We could definitely host it under miniwdl-ext, which just houses miniwdl-related code not formally sponsored by any org, or anywhere else you might see fit.

The backend plugin interface is reasonably well designed, although prose documentation is lacking (perhaps also something we could work on together). The best pointers to triangulate from are:

  1. miniwdl-aws for the high-level Python plugin structure, and batch_job.py for the implementation using an NFS share (EFS in that case, but we'd probably approach it similarly for an HPC cluster)
  2. task_container.py is the abstract base class for the container runtime backend, and has some detailed comments about what you're supposed to override
  3. cli_subprocess.py is another implementation underlying the podman/singularity/udocker backends

I think the solution would probably combine some of the shared filesystem bits in (1) with subprocess job submission bits from (3). But I'm not very familiar with how to invoke docker stuff on HPC clusters, so I look forward to learning about that from you.
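To sketch the shape of it, a backend plugin roughly boils down to the skeleton below. The names and signatures here are from memory, so please verify them against task_container.py and the example repos rather than trusting this comment.

```python
# Rough sketch of a container backend plugin (names/signatures approximate --
# task_container.py's comments are the authoritative reference).

from WDL.runtime.task_container import TaskContainer


class MyClusterContainer(TaskContainer):
    """Hypothetical skeleton; task_container.py documents what must be overridden."""

    @classmethod
    def global_init(cls, cfg, logger):
        # one-time setup, e.g. probing the scheduler / container runtime
        pass

    def _run(self, logger, terminating, command):
        # submit `command` through the cluster scheduler, poll until it finishes
        # (checking terminating() to honour cancellation), and return the exit code
        raise NotImplementedError


# The plugin is then advertised via a setuptools entry point so the miniwdl
# configuration can select it, something like:
#
#   entry_points={
#       "miniwdl.plugin.container_backend": [
#           "my_cluster = miniwdl_my_cluster:MyClusterContainer"
#       ]
#   }
```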

Happy to get a thread going on the #miniwdl Slack or set up a telecon, timezones permitting.

mlin avatar Jun 25 '22 05:06 mlin

@mlin How do you feel about the following plan?

  • We will set up miniwdl-slurm on github.com/biowdl/
  • We will experiment a bit. Should we go the Cromwell approach (template everything so any HPC system can be used, in which case miniwdl-hpc becomes a more apt name) or dedicate it to SLURM and keep the configuration simpler (hopefully)? Either way it will need some fiddling to get working. I prefer dedicating it to SLURM, as other HPC systems can make use of DRMAA, and a miniwdl-DRMAA plugin might be a much better approach for them.
  • As soon as we have a 0.1.0 version going that reasonably covers our use case, we move the repo to miniwdl-ext and add links in the miniwdl documentation so users can find it and give feedback.

This way the miniwdl-ext repository will not end up with a half-baked project and we will have some freedom to experiment before we commit.

rhpvorderman avatar Jun 27 '22 05:06 rhpvorderman

@rhpvorderman Sounds great to me! Just LMK how I can help at any point

Besides the obvious plumbing around invoking docker via the cluster scheduler, the other pivotal design choice you'll find is how to mount the input files for each task. As you know, miniwdl tries to mount input files in situ instead of copying them, which is usually a significant advantage IMO. In cli_subprocess.py you'll see the approach of instructing the container runtime to individually map each input file to its virtualized path. In the miniwdl-aws batch_job.py we instead mount the whole filesystem and create symlinks from each virtualized path to the original input file. There are pros and cons to each approach (see eg this note) that I'm sure you'll evaluate in due course.
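Very roughly, the symlink approach boils down to something like the following. This is illustrative only; the real logic, with all its corner cases, is in batch_job.py, and the function/argument names here are made up for the sketch.

```python
import os


def link_inputs(input_files, task_dir):
    """Hypothetical sketch: for each input file already reachable on the shared
    filesystem, place a symlink at the virtualized path the task will see,
    instead of copying or individually bind-mounting it.
    input_files: {host_path_on_shared_fs: virtualized_path}"""
    for host_path, virtual_path in input_files.items():
        dest = os.path.join(task_dir, virtual_path.lstrip("/"))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        os.symlink(host_path, dest)
```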

mlin avatar Jun 29 '22 04:06 mlin

@mlin Thanks for the feedback. The standard on cluster systems is Singularity, which allows mounting individual input files. Given that clusters usually operate on a shared filesystem (typically NFS), direct mounting is possible. Singularity on a cluster has some requirements to make it work, though. If I run into issues, I will make that functionality configurable for Singularity in this repo and have the miniwdl-slurm extension set the defaults appropriately.
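To sketch what I mean by mapping individual input files (the exact flags will need some of that fiddling, so treat this as an illustration, not the final invocation):

```python
def singularity_command(image, shell_command, input_map):
    """Hypothetical sketch: build a Singularity invocation that bind-mounts each
    input file from the shared filesystem to the path the task expects.
    input_map: {host_path_on_NFS: container_path}"""
    cmd = ["singularity", "exec", "--containall"]
    for host_path, container_path in input_map.items():
        cmd += ["--bind", f"{host_path}:{container_path}:ro"]
    cmd += [f"docker://{image}", "bash", "-c", shell_command]
    return cmd
```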

Thanks again, I will pitch this to my manager and start allocating some time. (The "summer holiday" is a great period for this.) It is always a pleasure to work with you, so I am looking forward to working on this project! Thanks!

rhpvorderman avatar Jun 29 '22 07:06 rhpvorderman

@rhpvorderman @DavyCats @illusional I put together an annotated example of a container backend plugin: miniwdl-backend-example. This should greatly expedite getting started when the opportunity arises.

mlin avatar Jul 05 '22 06:07 mlin

@mlin. This is great. I think I will be able to bootstrap it pretty nicely from here. If not I will come back here.

rhpvorderman avatar Jul 22 '22 10:07 rhpvorderman

Short update. The current init branch is able to execute the self test on our cluster :tada:. EDIT: https://github.com/LUMC/miniwdl-slurm/tree/init

It required only about 100 lines of code... and a small PR that needs to be done on miniwdl itself: #579

@mlin Thank you so much for miniwdl. Doing similar work in Cromwell was always a gruelling battle where I would need to spend two days grepping through the code to find the actual part I needed to change. (This is unfortunately no exaggeration.) With miniwdl I could get started right away and learn the structure as I went.

I want to test it with a few production workflows first to see if it is actually working. I am now running into problems with the BioWDL pipelines in general related to miniwdl, not necessarily the backend. So I will open a few issues/PRs to get that solved and after that will continue developing the slurm plugin.

rhpvorderman avatar Aug 05 '22 13:08 rhpvorderman

Update: miniwdl-slurm is done https://github.com/LUMC/miniwdl-slurm

  • Tested with real production workflows on our own cluster
  • Tested with GitHub CI, so continuous integration is in place.
  • 100% line coverage.

Cromwell vs. miniwdl + miniwdl-slurm: what does it mean for people running WDL workflows on a SLURM cluster?

| category | Cromwell | miniwdl + miniwdl-slurm |
| --- | --- | --- |
| database | Needs a database. This can be MySQL, PostgreSQL, in-memory HSQLDB, or in-memory with an overflow file. | No database needed. |
| runtime resource usage | Requires at least 3 GB of real memory to run with the in-memory database with overflow file, 30 GB without the overflow file. The overflow file can easily grow to multiple gigabytes in size. Cromwell utilizes multiple CPUs. | 100 MB memory, 1 CPU. |
| call caching | Not possible without the database. Speed limited by the database. | Uses a cache of JSON files. Fast. |
| configuration | Requires a quite big configuration file to describe the database and the cluster submission. | A few lines of configuration are sufficient. |
| robustness | Allows retrying jobs. Can continue the workflow even if a job has failed. | Allows retrying jobs. Cancels all other jobs in case of failure. |
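To give an idea of what "a few lines" means in practice, the miniwdl side of the configuration looks roughly like this. The backend name and the default image here are illustrative; the miniwdl-slurm README has the authoritative version.

```ini
# ~/.config/miniwdl.cfg (illustrative sketch; check the miniwdl-slurm README)
[scheduler]
container_backend = slurm_singularity

[task_runtime]
# default container image for tasks that don't specify one
defaults = {"docker": "ubuntu:22.04"}
```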

Cromwell's database has caused us numerous headaches. We will be glad to switch to miniwdl. There are some convenience features still missing, but these will be reported on the issue tracker and it is nothing major.

@mlin I want to release miniwdl-slurm. I think it would be great, both for the extension and for miniwdl itself, if it were hosted under the miniwdl-ext group. This would make miniwdl look like a more HPC-ready solution than if it is kept in a third-party repo. What do you think about this? My requirement for this to happen is that I continue to have full ownership of the miniwdl-slurm repo.

@illusional Creating miniwdl-sge should be very easy now: simply copy the code and change the srun commands to the appropriate qrsh commands. You need `qrsh -now no` at a minimum; `-now no` prevents the job from running in the interactive queue (which is extremely limited) and makes it run in the normal queue instead.
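To make the copy-and-change concrete, the scheduler-specific part is essentially just building the submission command. A sketch only; the SLURM flags mirror what the plugin uses, while the SGE side is illustrative and the parallel environment name in particular is site-specific.

```python
def submission_command(scheduler, job_name, cpus, memory_mb):
    """Hypothetical sketch of the scheduler-specific prefix wrapped around the
    singularity command."""
    if scheduler == "slurm":
        return ["srun", "--job-name", job_name,
                "--cpus-per-task", str(cpus), "--mem", f"{memory_mb}M"]
    elif scheduler == "sge":
        return ["qrsh", "-now", "no",        # stay out of the interactive queue
                "-N", job_name, "-pe", "smp", str(cpus)]
    raise ValueError(scheduler)
```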

rhpvorderman avatar Aug 24 '22 10:08 rhpvorderman

Fantastic!!!!

I think it would be great, both for the extension and for miniwdl itself, if it were hosted under the miniwdl-ext group. This would make miniwdl look like a more HPC-ready solution than if it is kept in a third-party repo. What do you think about this? My requirement for this to happen is that I continue to have full ownership of the miniwdl-slurm repo.

Definitely delighted to house the repo under miniwdl-ext with you as admin. I created the miniwdl-ext org to house miniwdl-related code that isn't formally sponsored/funded by some institution. For example, while some AWS folks help to maintain miniwdl-aws and work with their customers on it, they don't want it housed under the AWS GitHub org, implying it's their product or automatically covered by their support contracts. And CZI continues to generously sponsor miniwdl itself, but not necessarily these plugins that their internal teams don't have as much need for.

Anyway, if you would like to see it under miniwdl-ext (emphasize it's up to you) I think the cleanest way is for you to initiate a repo transfer through the Settings tab on GitHub, which I would accept and add you as an admin.

Cancels all other jobs in case of failure

BTW there's a config option to let already-running tasks complete, [scheduler] fail_fast = false. I know it's becoming difficult to keep track of all these config options, got a background thread running on how to help with that.
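i.e. in the configuration file:

```ini
[scheduler]
fail_fast = false
```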

mlin avatar Aug 25 '22 09:08 mlin

Definitely delighted to house the repo under miniwdl-ext with you as admin.

Great! Starting transfer now.

BTW there's a config option to let already-running tasks complete, [scheduler] fail_fast = false. I know it's becoming difficult to keep track of all these config options, got a background thread running on how to help with that.

Awesome! I checked the config options beforehand, but I missed it somehow. Thanks!

rhpvorderman avatar Aug 30 '22 06:08 rhpvorderman

Okay. The transfer is not working: "You don't have the permission to create public repositories on miniwdl-ext"

Can you:

  • Create a new empty miniwdl-slurm repository under miniwdl-ext?
  • Give me admin access?

Then I can push the commits to the repository and delete the LUMC repo.

rhpvorderman avatar Aug 30 '22 06:08 rhpvorderman

Thanks @mlin! The repository is transferred now. I requested codecov access to the organization so we can get coverage reports on Pull requests as well as display a coverage badge.

rhpvorderman avatar Aug 30 '22 10:08 rhpvorderman