wrangling-genomics

Wording of aligner choice in lesson 04-variant_calling

Open JanaSperschneider opened this issue 5 years ago • 1 comment

Hi,

thanks for putting together a great resource for novices to genomics data analysis!

I would suggest changing the wording in the subsection "Align reads to reference genome". Currently it reads: "We will use the BWA-MEM algorithm, which is the latest and is generally recommended for high-quality queries as it is faster and more accurate."

I would change the focus to enabling learners to make an informed choice based on their sequencing read types and use case, instead of saying BWA-MEM is the latest, faster and more accurate (than what?). For example, minimap2 is much faster than BWA-MEM and claims to be more accurate. However, minimap2 is not suited to spliced alignment.

I would say something like this instead: "We will use the BWA-MEM algorithm, which is well suited to aligning accurate short-read transcriptomic Illumina data to genomic sequences. Alternatively, aligners such as minimap2 are well suited to aligning noisy long-read data or short-read genomic Illumina data. Choosing an aligner appropriate for the sequencing read type is crucial for high-quality downstream genomic data analysis, and some time should be spent choosing the best tool for the job."

Thanks, Jana

JanaSperschneider, Sep 15 '19 21:09
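
To make the "choose by read type" idea above concrete, here is a minimal sketch of how a learner might encode that decision. It is not taken from the lesson: the function name `build_align_command`, the read-type labels, and the file names are all hypothetical, and the mapping is just one reasonable default. The only tool options used are `bwa mem` and minimap2's `-ax map-ont` / `-ax map-pb` presets.

```python
def build_align_command(read_type, reference, reads):
    """Return an argument list for an aligner chosen by read type.

    Hypothetical helper for illustration only; the mapping below is an
    assumption, not a recommendation from the lesson.
    """
    if read_type == "illumina":
        # Accurate short reads: BWA-MEM (the reference must already be
        # indexed with `bwa index`).
        return ["bwa", "mem", reference, *reads]

    # Noisy long reads: minimap2 with a technology-specific preset.
    presets = {"nanopore": "map-ont", "pacbio": "map-pb"}
    if read_type in presets:
        return ["minimap2", "-ax", presets[read_type], reference, *reads]

    raise ValueError(f"no aligner configured for read type: {read_type!r}")


if __name__ == "__main__":
    # File names are placeholders.
    print(" ".join(build_align_command("illumina", "ref.fasta",
                                       ["reads_1.fastq", "reads_2.fastq"])))
    print(" ".join(build_align_command("nanopore", "ref.fasta",
                                       ["long_reads.fastq"])))
```

The function only builds the command; running it (for example with `subprocess.run`) and redirecting the SAM output to a file is left to the caller.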

Hi @JanaSperschneider! Thank you very much for this comment!

I quite like the wording you propose; would you be willing to put in a PR with these changes?

fpsom, Sep 16 '19 07:09