Chunking and stitching to parallelise alignment
It would be very useful to have automated splitting of large FASTQ files to parallelise alignment, followed by automated concatenation of the resulting aligned files. BWA alignment is often the most time-consuming step of processing.
This could speed up processing of, for example, mammalian ancient genome sequencing, where a single library can yield 300-4000 million reads.
This would be good for a centralised nf-core module. One for the upcoming hackathon!
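For anyone wanting to prototype this before a proper module exists, here is a minimal Nextflow DSL2 sketch of the idea (not the actual eager implementation): `splitFastq` chunks the input FASTQ, each chunk is aligned with `bwa aln`/`bwa samse`, and the per-chunk BAMs are stitched back together with `samtools merge`. The parameter names, process names, and chunk size of 1 million reads are all illustrative assumptions.

```nextflow
#!/usr/bin/env nextflow
nextflow.enable.dsl = 2

// Illustrative parameters (not real nf-core/eager options)
params.reads      = 'library.fastq'
params.reference  = 'reference.fasta'
params.chunk_size = 1_000_000           // reads per chunk (assumption)

// Align one FASTQ chunk with bwa aln/samse and coordinate-sort the output
process BWA_ALIGN_CHUNK {
    input:
    path chunk
    path fasta
    path index   // BWA index files (.amb/.ann/.bwt/.pac/.sa) staged next to the FASTA

    output:
    path "${chunk.baseName}.bam"

    script:
    """
    bwa aln -t ${task.cpus} ${fasta} ${chunk} > ${chunk.baseName}.sai
    bwa samse ${fasta} ${chunk.baseName}.sai ${chunk} \\
        | samtools sort -@ ${task.cpus} -o ${chunk.baseName}.bam -
    """
}

// Stitch the per-chunk BAMs back into a single library-level BAM
process MERGE_BAMS {
    input:
    path bams

    output:
    path 'merged.bam'

    script:
    """
    samtools merge merged.bam ${bams}
    """
}

workflow {
    // Split the input FASTQ into chunks that are aligned in parallel
    chunks = Channel.fromPath(params.reads)
                    .splitFastq(by: params.chunk_size, file: true)

    fasta = file(params.reference)
    index = Channel.fromPath("${params.reference}.{amb,ann,bwt,pac,sa}").collect()

    // Merge only runs once every chunk alignment has completed
    MERGE_BAMS(BWA_ALIGN_CHUNK(chunks, fasta, index).collect())
}
```

The same pattern would work with `bwa mem`; the key point is that the merge step waits on `collect()`, so all chunk alignments finish before the BAMs are concatenated.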
FYI, this could be a great option: https://github.com/bigdatagenomics/cannoli/issues/323. It works out of the box with Singularity and Docker.
This is already done with sharding, from @shyama-mama! https://github.com/nf-core/eager/pull/1023