wrangling-genomics
Wording of aligner choice in lesson 04-variant_calling
Hi,
thanks for putting together a great resource for novices to genomics data analysis!
I would suggest changing the wording in the subsection "Align reads to reference genome". Currently it reads: "We will use the BWA-MEM algorithm, which is the latest and is generally recommended for high-quality queries as it is faster and more accurate."
I would change the focus to enabling learners to make an informed choice based on their sequencing read types and use case, instead of saying BWA-MEM is the latest, faster and more accurate (than what?). For example, minimap2 is much faster than BWA-MEM and claims to be more accurate. However, minimap2 is not suited to spliced alignment.
I would say something like this instead: "We will use the BWA-MEM algorithm, which is well suited to aligning accurate short-read transcriptomic Illumina data to genomic sequences. Alternatively, aligners such as minimap2 are well suited to aligning noisy long-read data or short-read genomic Illumina data. Choosing an aligner that matches the sequencing read type is crucial for high-quality downstream genomic data analysis, and some time should be spent choosing the best tool for the job."
Thanks, Jana
Hi @JanaSperschneider! Thank you very much for this comment!
I quite like the wording you propose; would you be willing to put in a PR with these changes?