
Chunking and stitching to parallelise alignment

Open pontussk opened this issue 3 years ago • 2 comments

It would be very useful to have automated splitting of large FASTQ files to parallelise alignment, followed by automated concatenation of the aligned output files. BWA alignment is often the most time-consuming step of processing.

This could speed up processing of, e.g., ancient mammalian genome sequencing data, which can comprise 300-4000 million reads from a single library.
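
To make the request concrete, here is a rough sketch of the chunk-and-stitch idea in Python. It is only an illustration, not how nf-core/eager does it: the function names `chunk_fastq` and `stitch` and the `reads_per_chunk` parameter are hypothetical, and it assumes plain four-line FASTQ records (no wrapped sequence lines).

```python
from itertools import islice
from typing import Iterable, Iterator, List

LINES_PER_READ = 4  # assumed FASTQ layout: header, sequence, '+', qualities


def chunk_fastq(lines: Iterable[str], reads_per_chunk: int) -> Iterator[List[str]]:
    """Yield lists of FASTQ lines, each holding at most reads_per_chunk reads."""
    if reads_per_chunk < 1:
        raise ValueError("reads_per_chunk must be at least 1")
    it = iter(lines)
    while True:
        # Take whole records only, so no read is split across two chunks.
        chunk = list(islice(it, reads_per_chunk * LINES_PER_READ))
        if not chunk:
            return
        if len(chunk) % LINES_PER_READ:
            raise ValueError("truncated FASTQ record")
        yield chunk


def stitch(chunks: Iterable[List[str]]) -> List[str]:
    """Concatenate per-chunk outputs back together in their original order."""
    return [line for chunk in chunks for line in chunk]
```

In a real pipeline each chunk would be written to its own file and aligned as a separate job, and the stitching step would merge the aligned outputs with a dedicated tool (for example `samtools merge`) rather than by plain line concatenation.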

pontussk, Sep 15 '21 08:09

This would be good for a centralised nf-core module. One for the upcoming hackathon!

jfy133, Sep 16 '21 09:09

FYI, this could be a great option: https://github.com/bigdatagenomics/cannoli/issues/323. It works out of the box with Singularity and Docker.

yassineS, Oct 06 '21 00:10

This is already done with sharding, contributed by @shyama-mama!

https://github.com/nf-core/eager/pull/1023

jfy133, Mar 15 '24 14:03