What should be done about NovaSeq base qualities?

Open samhunter opened this issue 5 years ago • 1 comments

Instead of quality scores ranging from 2 to 40, the NovaSeq (and maybe the iSeq 100 and NextSeq?) reports only four quality scores: 2, 12, 23 and 37.
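To make the scheme concrete, here is a minimal Python sketch (not HTStream code) that decodes FASTQ quality lines and checks whether every score falls on the four levels quoted above. It assumes the usual Phred+33 encoding; `phred_scores` and `looks_binned` are hypothetical helper names, and the level set is taken from this issue rather than from any instrument documentation.

```python
# Sketch only: assumes Phred+33 FASTQ quality encoding.
# The four levels are the ones quoted in this issue.
BINNED_LEVELS = frozenset({2, 12, 23, 37})


def phred_scores(quality_line, offset=33):
    """Decode one FASTQ quality line into a list of integer Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def looks_binned(quality_lines, levels=BINNED_LEVELS):
    """Return True if every observed score is one of the given levels.

    Note: an empty input has no scores outside the levels, so it also
    returns True.
    """
    seen = set()
    for line in quality_lines:
        seen.update(phred_scores(line))
    return seen <= levels
```

A check like this could let a tool decide whether a run needs special handling before trimming or overlapping reads.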

  1. How should this be dealt with for quality score based window trimming?

  2. How should this be dealt with for overlapping reads and resolving mismatched bases?

  3. Should data from a NovaSeq etc. be made to conform to the 2, 12, 23, 37 quality scheme (which would add complexity to the different algorithms)?

  4. Perhaps a new tool that tries to improve/correct quality scores could be useful.
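For question 3, one possible approach is to snap any computed quality (for example, a score produced while merging overlapping reads) back to the nearest of the four levels. The Python sketch below is an illustration only, not what HTStream does; nearest-level rounding with ties going to the lower level is an assumption, `rebin` and `rebin_quality_line` are hypothetical names, and Phred+33 encoding is assumed.

```python
# Sketch only: nearest-level re-binning is an assumed policy, not the
# instrument's own binning rule. Phred+33 encoding is assumed.
LEVELS = (2, 12, 23, 37)


def rebin(score, levels=LEVELS):
    """Map a Phred score to the nearest level; ties go to the lower level."""
    return min(levels, key=lambda lv: (abs(lv - score), lv))


def rebin_quality_line(quality_line, offset=33, levels=LEVELS):
    """Re-bin every score in a FASTQ quality line and re-encode it."""
    return "".join(
        chr(rebin(ord(ch) - offset, levels) + offset) for ch in quality_line
    )
```

Breaking ties downward is the conservative choice, since it never reports a base as more reliable than the computed score suggests.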

See https://lh3.github.io/2017/07/24/on-nonvaseq-base-quality and http://lh3.github.io/2014/11/03/on-hiseq-x10-base-quality for some discussion of this issue.

samhunter, Dec 12 '19 18:12