HTStream
What should be done about NovaSeq base qualities?
Instead of quality scores spanning the range from 2 to 40, the NovaSeq (and maybe the iSeq 100 and NextSeq?) reports only four quality scores: 2, 12, 23 and 37.
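As a rough illustration, here is a minimal Python sketch of how a tool might detect that a FASTQ file uses this binned scheme. The bin values are taken from the description above; the function names (`phred_scores`, `looks_binned`) and the Phred+33 encoding are assumptions for the example, not anything that exists in HTStream.

```python
# Bin values as described in this issue; other instruments or software
# versions may use a different set, so they are kept as a parameter.
NOVASEQ_BINS = frozenset({2, 12, 23, 37})


def phred_scores(quality_line, offset=33):
    """Decode one FASTQ quality string into integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]


def looks_binned(quality_lines, bins=NOVASEQ_BINS):
    """Return True if every score seen in the given quality strings is in `bins`."""
    seen = set()
    for line in quality_lines:
        seen.update(phred_scores(line))
    # An empty input is not evidence of binning, so require at least one score.
    return bool(seen) and seen <= bins
```

Running `looks_binned` on the quality lines of the first few thousand reads would probably be enough to decide whether the downstream algorithms should expect binned values.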
- How should this be dealt with for quality-score-based window trimming?
- How should this be dealt with for overlapping reads and resolving mismatched bases?
- Should data from a NovaSeq etc. be made to conform to the 2, 12, 23, 37 quality scheme (introducing more complexity to the different algorithms)?
- Perhaps a new tool that tries to improve/correct quality scores could be useful.
See https://lh3.github.io/2017/07/24/on-nonvaseq-base-quality and http://lh3.github.io/2014/11/03/on-hiseq-x10-base-quality for some discussion on this issue.
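To make the window trimming and overlap questions more concrete, here is a small Python sketch of both operations on binned scores. Both functions are hypothetical simplifications written for this discussion: `trim_length` is a generic mean-quality sliding window, and `resolve_mismatch` is one possible "higher quality wins" rule. Neither is claimed to be what HTStream actually does, and the window size, threshold, and tie handling are assumptions.

```python
def trim_length(scores, window=4, min_mean=20.0):
    """Return how many 5' bases to keep under a simple sliding-window rule.

    The read is cut at the start of the first window whose mean quality
    falls below `min_mean` (hypothetical rule for illustration only).
    """
    if not scores:
        return 0
    last_start = max(len(scores) - window, 0)
    for start in range(last_start + 1):
        chunk = scores[start:start + window]
        if sum(chunk) / len(chunk) < min_mean:
            return start
    return len(scores)


def resolve_mismatch(base1, q1, base2, q2):
    """Pick a consensus (base, quality) for one position in an overlap.

    Hypothetical rule: agreeing bases keep the higher score, disagreeing
    bases go to the higher-quality call with the score difference, and an
    exact tie cannot be resolved, so it becomes N with quality 2.
    """
    if base1 == base2:
        return base1, max(q1, q2)
    if q1 == q2:
        return "N", 2
    if q1 > q2:
        return base1, q1 - q2
    return base2, q2 - q1


# With only four possible scores, a window mean can take far fewer distinct
# values, and two disagreeing bases land in the same bin (a tie) much more often.
binned_read = [37, 37, 37, 37, 23, 23, 12, 12, 2, 2]
kept = trim_length(binned_read)
```

With the example read, the first window whose mean drops below 20 starts at position 4 (scores 23, 23, 12, 12), so `kept` is 4. The tie case in `resolve_mismatch` is where binning seems most likely to hurt: two disagreeing calls that both report 37 give no basis for choosing between them.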