wrangling-genomics Assessing Read Quality: Improve the visualisation of Phred Quality Scores

Open theheking opened this issue 3 years ago • 1 comments

I think the "Details on the FASTQ format" section is fairly confusing for beginners. For better readability and understanding of the quality scores, I think that the sentence "This quality score is logarithmically based, so a quality score of 10 reflects a base call accuracy of 90%, but a quality score of 20 reflects a base call accuracy of 99%" should be paired with the classic table of quality scores:

Quality Score	Probability of Base Error	Base Confidence	Sanger Encoded ASCII Character
10	0.1	90%	"+"
20	0.01	99%	"5"
30	0.001	99.9%	"?"
40	0.0001	99.99%	"I"
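
If it helps, the rows above can be reproduced with a few lines of Python. This is only a sketch, assuming the standard Phred definition Q = -10 * log10(P) and the Sanger (Phred+33) ASCII offset; the function names are my own and not part of the lesson.

```python
# Sketch: reproduce the quality score table from the Phred definition.
# Assumes Sanger / Phred+33 encoding; function names are illustrative only.

PHRED_OFFSET = 33  # ASCII offset used by the Sanger encoding


def error_probability(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def phred_to_char(q):
    """ASCII character that encodes Phred score q in a FASTQ quality line."""
    return chr(q + PHRED_OFFSET)


def char_to_phred(c):
    """Phred score encoded by a single FASTQ quality character."""
    return ord(c) - PHRED_OFFSET


# Print one tab-separated row per score, in the same column order as the table.
for q in (10, 20, 30, 40):
    p = error_probability(q)
    print(f"{q}\t{p:g}\t{(1 - p) * 100:g}%\t{phred_to_char(q)!r}")
```

Going the other way, `char_to_phred("I")` gives the score for a character read from a FASTQ quality line, which may be the more useful direction for learners looking at real reads.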

Apr 16 '21 13:04 theheking

Agreed. Also, with the increasing use of NovaSeq as a sequencing platform, it would be good to include an explanation of NovaSeq quality scores. The quality scores are binned into four reported values: 12 for marginal quality (<Q15), 23 for medium quality (~Q20), 37 for high quality (>Q30), and 2 as a null score for no-calls (https://www.illumina.com/content/dam/illumina-marketing/documents/products/appnotes/novaseq-hiseq-q30-app-note-770-2017-010.pdf).
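
To show what binning looks like in practice, here is a rough Python sketch that decodes a quality string and counts how many bases fall into each bin. It assumes Phred+33 encoding and the four reported values listed above (2, 12, 23, 37); the example quality string and the names in the code are made up for illustration, and other instruments or software versions may report different values.

```python
# Sketch: summarise a binned quality string by bin label.
# Bin values are the ones quoted in this comment; names are illustrative only.
from collections import Counter

PHRED_OFFSET = 33  # Sanger / Phred+33 ASCII offset (assumed)

# Reported score -> bin label, as described in the comment above.
NOVASEQ_BINS = {2: "no-call", 12: "marginal", 23: "medium", 37: "high"}


def summarize_bins(quality_string):
    """Count bases per quality bin; scores outside the known bins are 'unbinned'."""
    scores = [ord(c) - PHRED_OFFSET for c in quality_string]
    return Counter(NOVASEQ_BINS.get(s, "unbinned") for s in scores)


# Hypothetical quality line: 'F' = 37, '8' = 23, '-' = 12, '#' = 2
example = "FFFF8F-#"
print(summarize_bins(example))
```

A quality line from binned data contains only a handful of distinct characters, which can look odd to beginners who expect the full range shown in the table.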

Dec 01 '22 19:12 amishaporet
