Deduplication Parallelization

Open apeltzer opened this issue 7 years ago • 1 comments

We could think about splitting large BAMs before deduplication, since DeDup/MarkDup normally takes quite some time. A condition such as file.size > 2GB (or a similar check) could decide when to split, which should speed things up significantly. The subsequent merge would take only minutes and would automatically produce the same output for downstream analysis as before.

apeltzer, Oct 10 '18 11:10
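
The proposal in the opening comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not how the pipeline implements it: the 2 GB cutoff is taken from the comment, splitting per reference sequence (contig) is an assumption the issue does not specify, and the helper names `should_split`, `plan_split_commands`, and `plan_merge_command` are hypothetical. The functions only build `samtools` command lines and do not run them; the per-chunk deduplication step itself is left out. Extracting a region with `samtools view` requires an indexed, coordinate-sorted BAM.

```python
from pathlib import Path

# Assumed cutoff, taken from the "file.size>2GB" idea in the issue text.
SPLIT_THRESHOLD_BYTES = 2 * 1024**3


def should_split(bam_size_bytes, threshold=SPLIT_THRESHOLD_BYTES):
    """Return True when a BAM of this size is large enough to be worth splitting."""
    return bam_size_bytes > threshold


def plan_split_commands(bam, contigs):
    """Build one `samtools view` command per contig (hypothetical split strategy).

    Returns the chunk file names and the commands that would create them.
    """
    stem = Path(bam).stem
    chunks, commands = [], []
    for contig in contigs:
        chunk = f"{stem}.{contig}.bam"
        # -b: write BAM output, -o: output file, trailing argument: region to extract
        commands.append(["samtools", "view", "-b", "-o", chunk, bam, contig])
        chunks.append(chunk)
    return chunks, commands


def plan_merge_command(deduplicated_chunks, merged_bam):
    """Build the `samtools merge` command that recombines the deduplicated chunks."""
    return ["samtools", "merge", merged_bam, *deduplicated_chunks]


if __name__ == "__main__":
    chunks, commands = plan_split_commands("sample.bam", ["chr1", "chr2"])
    for command in commands:
        print(" ".join(command))
    print(" ".join(plan_merge_command(chunks, "sample.dedup.bam")))
```

Each chunk would be deduplicated independently (and in parallel) before the merge. Whether per-contig chunks give output identical to a single run depends on the tool and data; for example, read pairs mapping to different contigs would need separate consideration.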

I was about to suggest closing this, as we aren't really promoting the use of DeDup anymore outside of niche cases, but I see it is also valid for MarkDuplicates, so I have renamed the issue.

jfy133, Dec 03 '20 07:12

Done: https://github.com/nf-core/eager/pull/944

jfy133, Feb 09 '24 10:02