workflows issues

Add intentional pre and post checks for file emptiness

Many of the bioinformatics tools we wrap have no special handling for empty inputs or outputs (including files that have a header but no content or alignments). These "empty" files...

a-frantz
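
As an illustration of the kind of pre/post check this issue asks for, here is a minimal Python sketch, not code from the repository. It assumes the file is a text format (optionally gzip-compressed) whose header lines start with `@` or `#`, as in SAM or VCF; binary formats such as BAM would need a tool-specific check. The function name `has_records` and the `header_prefixes` parameter are hypothetical.

```python
import gzip
from pathlib import Path


def has_records(path, header_prefixes=("@", "#")):
    """Return True if a text file has at least one non-header, non-blank line.

    Hypothetical helper: a zero-byte file and a header-only file are both
    treated as "empty".
    """
    path = Path(path)
    if path.stat().st_size == 0:
        return False
    # Detect gzip by its magic bytes rather than trusting the extension.
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"
    opener = gzip.open if is_gzip else open
    with opener(path, "rt") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped and not stripped.startswith(header_prefixes):
                return True
    return False
```

FASTQ read names also begin with `@`, so a FASTQ check would pass `header_prefixes=()` to count every non-blank line as a record.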

Lint fixes

Add trailing commas. Add blank lines between elements.

adthrasher

feat: wdl-format on everything

a-frantz

Fragility in workflows around malformed/missing RG records

a-frantz

more dynamically allocated mem

Some of our tasks statically allocate a large amount of RAM, which over-allocates for many inputs. One example: https://github.com/stjudecloud/workflows/blob/main/tools/ngsderive.wdl#L366

a-frantz
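
The arithmetic behind a size-scaled memory request can be sketched as follows. This is an illustrative Python sketch under assumed values, not the repository's WDL: the function name `memory_gb`, the scale factor, the base allowance, and the cap are all hypothetical and would need tuning per tool.

```python
import math


def memory_gb(input_bytes, scale=2.0, base_gb=4, cap_gb=64):
    """Estimate a memory request (in GB) from the input size.

    All defaults are placeholder assumptions: `scale` is GB of RAM per GiB
    of input, `base_gb` is a fixed overhead, and `cap_gb` is an upper bound.
    """
    size_gib = input_bytes / 1024**3
    return min(cap_gb, base_gb + math.ceil(size_gib * scale))
```

With these defaults, a small input requests only the base allowance, while very large inputs are clamped to the cap instead of growing without bound.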

Tasks that take both pos/name sorted probably need different resources

The major culprit here is HTSeq: https://github.com/stjudecloud/workflows/blob/main/tools/htseq.wdl. Its sort algorithm is inefficient and consumes a lot of resources when the input is position-sorted. We've exposed the name-sort option but...

a-frantz

FastQC `prefix` convention is fragile

See here: https://github.com/stjudecloud/workflows/blob/main/tools/fastqc.wdl#L59. The linked code may not work if `prefix` is changed from its default.

a-frantz

Expose more parameters of tools

In the early days of this repo we tended to only expose parameters we use. We've since gotten much better at exposing parameters as we add tools. But there are...

a-frantz

Better document where GZIP input is required and where it's optional (but recommended)

I think it's appropriate to default to gzipped inputs and that should be the standard we support. I don't see a need to go out of our way to support...

a-frantz

enhancement

support Single-Ended data

All of our workflows and (almost) all of our tasks assume that data is Paired-End. Single-End (SE) support would make our workflows and tools more accessible.

a-frantz
