new subworkflow: Gatk4/Align and Preprocess
This will add a subworkflow that follows the GATK4 best practices for aligning and preprocessing sequencing data, so that the output can be used by the GATK variant calling workflows (joint germline, tumour normal, tumour paired, create panel of normals).
It will accept either FASTQ files (the usual input) or uBAMs (the input GATK recommends).
- uBAMs will be converted to FASTQ and aligned with BWA; gatk4 mergebamalignment is then used to recover data from the unaligned reads so that it is kept alongside their aligned counterparts.
- FASTQ files are simply aligned with BWA.
- gatk4 markduplicates is then used to mark duplicate reads, and the files are sorted and indexed.
- Finally, the two-step BQSR process is run to recalibrate the files: gatk4 baserecalibrator followed by gatk4 applybqsr.
- The recalibrated reads can then be used for GATK variant calling.
best practices: https://gatk.broadinstitute.org/hc/en-us/articles/360035535912-Data-pre-processing-for-variant-discovery
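To make the intended ordering concrete, here is a rough Python sketch of the step list for each input type. It is only an illustration of the plan above, not the subworkflow itself (which would be written in Nextflow). The function name `preprocessing_steps` is hypothetical, `gatk4 samtofastq` is an assumption for the uBAM to FASTQ conversion (the tool is not named above), and `sort` and `index` are generic placeholders because the tools for those steps have not been chosen yet.

```python
from typing import List


def preprocessing_steps(input_type: str) -> List[str]:
    """Return the ordered step names for one sample (hypothetical helper)."""
    kind = input_type.lower()
    if kind == "ubam":
        # uBAM route: convert to FASTQ, align, then merge the unaligned data back in.
        # "gatk4 samtofastq" is an assumed choice for the conversion step.
        alignment = ["gatk4 samtofastq", "bwa", "gatk4 mergebamalignment"]
    elif kind == "fastq":
        # FASTQ route: alignment only.
        alignment = ["bwa"]
    else:
        raise ValueError(f"unsupported input type: {input_type!r}")

    # Steps shared by both routes. "sort" and "index" are placeholders,
    # since the issue does not say which tools perform them.
    shared = [
        "gatk4 markduplicates",
        "sort",
        "index",
        "gatk4 baserecalibrator",
        "gatk4 applybqsr",
    ]
    return alignment + shared


if __name__ == "__main__":
    for kind in ("fastq", "ubam"):
        print(kind, "->", " | ".join(preprocessing_steps(kind)))
```

Both routes converge after alignment, so everything from duplicate marking onwards should be shareable between FASTQ and uBAM inputs.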
- [x] This module does not exist yet with the `nf-core modules list` command
- [x] There is no open pull request for this module
- [x] There is no open issue for this module
- [x] If I'm planning to work on this module, I added myself to the `Assignees` to facilitate tracking who is working on the module