modules icon indicating copy to clipboard operation
modules copied to clipboard

Dedup subworkflow name

Open SPPearce opened this issue 2 years ago • 10 comments

I propose that we rename the bam_dedup_stats_samtools_umitools subworkflow to be just bam_dedup_umitools. Various other subworkflows generate stats using samtools, without mentioning it in the name of the subworkflow.

I think the main tool should be mentioned in the title, not the auxiliary tools.

SPPearce avatar Oct 17 '22 12:10 SPPearce

Also edit tests/config/pytest_modules.yml

Ah! Didn't spot that it was listed starting with subworkflows in here, checked under bam. Should have grepped of course...

SPPearce avatar Oct 17 '22 13:10 SPPearce

There's also bam_sort_stats_samtools? What do we want to do about that one?

edmundmiller avatar Oct 17 '22 13:10 edmundmiller

That's a toughy since it's both samtools...

matthdsm avatar Oct 17 '22 13:10 matthdsm

There's also bam_sort_stats_samtools? What do we want to do about that one?

That one does actually sort the bam and the gets the stats using samtools, so I'm less fussed about that.

SPPearce avatar Oct 17 '22 13:10 SPPearce

I'd even consider dropping stats from all the various workflows. I don't see much value in running it a million times when using subworkflows. I would much rather have 1 qc workflow/step I could put where I want, without being forced into running stats every time.

matthdsm avatar Oct 17 '22 13:10 matthdsm

⏫ at some point we're just duplicating code IMO

matthdsm avatar Oct 17 '22 13:10 matthdsm

Yeah, that's the tough part with subworkflows, it's opinionated. We can't get down to something atomic like the modules, it's just going to have to be community consensus. But without the stats, you're just basically running UMITOOLS_DEDUP :shrug:

edmundmiller avatar Oct 17 '22 14:10 edmundmiller

Maybe we could add an option to each subworkflow that defines whether the subworkflow runs the stats or not?

nvnieuwk avatar Oct 17 '22 14:10 nvnieuwk

Yep that would be useful :)

nvnieuwk avatar Oct 17 '22 14:10 nvnieuwk

Agree we should maybe have an option to switch the stats off. Still think we should keep it in the name though otherwise what distinguishes it from another subworkflow that doesn't include this at all? We should be as explicit as possible about what the subworkflow is actually doing so it's easier to find, use and reuse if required with other tools that could be drop in replacements. Happy to have another discussion about this next week maybe on a maintainers call?

drpatelh avatar Oct 17 '22 20:10 drpatelh

bam_dedup_stats_samtools_umitools sounds like it might be using samtools markdup, whereas the actual important part of the name is umitools.

SPPearce avatar Oct 18 '22 06:10 SPPearce

@drpatelh , @SPPearce What's the status of this PR? Are we good to merge or should we convert to draft and/or close? We're cleaning up open PR's so we have nice place to start the upcoming Hackathon!

Cheers Matthias - nf-core maintainers

matthdsm avatar Mar 07 '23 07:03 matthdsm

I'm still of the opinion that we should rename this subworkflow, and think that the other subworkflow names in the recent months are on the side of renaming it.

SPPearce avatar Mar 07 '23 10:03 SPPearce

We might want to merge https://github.com/nf-core/modules/pull/2979 before we proceed though

matthdsm avatar Mar 09 '23 10:03 matthdsm

Yes, and probably #2971 as well.

SPPearce avatar Mar 09 '23 10:03 SPPearce

What's happening with this PR, should we drop it and restart from scratch?

maxulysse avatar Nov 10 '23 13:11 maxulysse

Probably best to close this one, yes. Are we renaming the subworkflow then?

SPPearce avatar Nov 10 '23 14:11 SPPearce

Are we renaming the subworkflow then?

Let's make an issue for that, and discuss there, and then if needed we can have a PR

maxulysse avatar Nov 10 '23 14:11 maxulysse