miniwdl
Keep going with independent tasks
When a WDL task fails (run via AGC), I've observed that all other tasks are killed. I'd like miniwdl to keep executing tasks until none remain that are independent of the failure(s). See snakemake's --keep-going option and Nextflow's errorStrategy. This would let me complete as many tasks as possible, so that the next time I re-run the workflow with task caching, I'll be closer to completion and the task that failed will start sooner (to see if it works this time).
There's a config option, [scheduler] fail_fast = false (or the environment variable MINIWDL__SCHEDULER__FAIL_FAST=false), that should do this. I don't think we have a test case for it with AWS specifically, but that scheduler logic is "above" the container backend. Let me know if it doesn't work.
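For reference, here is a minimal sketch of how the two forms mentioned above relate, assuming the MINIWDL__SECTION__KEY naming pattern generalizes from the fail_fast example. The helper names miniwdl_env_var and cfg_fragment are hypothetical, written only for illustration; they are not part of miniwdl.

```python
def miniwdl_env_var(section: str, key: str) -> str:
    """Build the environment variable name for a config option.

    Assumes the MINIWDL__<SECTION>__<KEY> pattern seen in
    MINIWDL__SCHEDULER__FAIL_FAST applies to other options too.
    """
    return f"MINIWDL__{section.upper()}__{key.upper()}"


def cfg_fragment(section: str, key: str, value: str) -> str:
    """Render the equivalent override as a .cfg file fragment."""
    return f"[{section}]\n{key} = {value}\n"


if __name__ == "__main__":
    # The option discussed in this thread, in both forms.
    print(miniwdl_env_var("scheduler", "fail_fast") + "=false")
    print(cfg_fragment("scheduler", "fail_fast", "false"))
```

Either form should have the same effect; the environment variable is handy for one-off runs, while the cfg fragment suits a file kept alongside the workflow.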
Where’s a good place to contribute documentation for my future self about these config options?
That default.cfg is commented extensively, but the docs on configuration do a really mediocre job of linking out to it -- that's probably the low-hanging fruit
Another likely problem is that AGC doesn't yet make it super convenient to set the more-advanced config options (those that don't have dedicated command-line arguments). https://github.com/aws/amazon-genomics-cli/pull/420 would help with that.
Until that's available, I think you'd have to:
- copy the AWS-specific cfg file which gets baked into the docker image AGC uses for miniwdl
- add desired overrides to your copy of the file
- include it in the workflow source directory (so that it gets bundled up into the zip file that AGC sends into the context)
- set {"engineOptions": "--cfg path/to/custom.cfg"} in the MANIFEST.json
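The steps above could be scripted roughly as follows. This is only a sketch under stated assumptions: the file locations passed in are placeholders you would supply yourself, the function names are hypothetical, and it assumes the AWS-specific cfg file parses with Python's configparser. Note that rewriting the file this way drops its comments.

```python
import configparser
import json
from pathlib import Path


def write_custom_cfg(base_cfg: Path, custom_cfg: Path) -> None:
    """Copy base_cfg to custom_cfg with [scheduler] fail_fast = false added."""
    # interpolation=None keeps any literal '%' in values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(base_cfg)
    if not parser.has_section("scheduler"):
        parser.add_section("scheduler")
    parser.set("scheduler", "fail_fast", "false")
    with custom_cfg.open("w") as fh:
        parser.write(fh)  # comments from base_cfg are not preserved


def set_engine_options(manifest_path: Path, cfg_relpath: str) -> None:
    """Point the manifest's engineOptions at the custom cfg file."""
    manifest = json.loads(manifest_path.read_text())
    manifest["engineOptions"] = f"--cfg {cfg_relpath}"
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n")
```

You would call write_custom_cfg with your copy of the AWS-specific cfg and a destination inside the workflow source directory, then call set_engine_options with the MANIFEST.json path and the cfg path relative to that directory.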
Thanks @mlin this is super helpful and thank you for your patience. I’m really enjoying using miniwdl so hopefully these questions are not seen as a criticism but a desire to understand, and contribute back in some small part.