wrangling-genomics
change quality encoding scale in 02 Assessing Read Quality
In 02 Assessing Read Quality, quality encoding only goes up to 40, but J = 41, and K = 42. This is confusing because the read we use as an example has Js in it. We should change the example from:
Quality encoding: !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHI
                  |         |         |         |         |
Quality score:    0........10........20........30........40
to:
Quality encoding: !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJK
                  |         |         |         |         |
Quality score:    0........10........20........30........40..
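For anyone who wants to double-check the mapping, here is a minimal sketch. It assumes the lesson's reads use the Phred+33 encoding, which matches the scale above, where "!" is 0 and "I" is 40. The function names are made up for illustration and are not part of the lesson.

```python
# Assumption: Phred+33 encoding, i.e. score = ASCII code of the character - 33.
PHRED_OFFSET = 33


def phred_scores(quality_string):
    """Convert a FASTQ quality string into a list of integer Phred scores."""
    return [ord(ch) - PHRED_OFFSET for ch in quality_string]


def encoding_ruler(max_score):
    """Return the characters that encode scores 0..max_score, in order."""
    return "".join(chr(score + PHRED_OFFSET) for score in range(max_score + 1))


if __name__ == "__main__":
    print(encoding_ruler(40))    # current scale, ends at "I"
    print(encoding_ruler(42))    # proposed scale, ends at "K"
    print(phred_scores("IJK"))   # [40, 41, 42]
```

Under that assumption, "J" and "K" decode to 41 and 42, so any read containing them falls off the end of a scale that stops at "I".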
This would be solved by #127.
#127 has merge conflicts. Any volunteers to fix that up? 😄
I'm not really sure that #127 is still applicable. In the updated version, there is a short description of the FASTQ format (section "Details on the FASTQ format" in the 2nd episode), which could be removed if it is already present in the shell genomics lesson. However, I would argue (for completeness) that we keep that part in, maybe set off in a different color? @aschuerch @taylorreiter thoughts?
@fpsom these lessons are already quite long. I would vote to link to the shell genomics lesson that covers this material, which has since been updated. Does this sound ok? If so, I will address the merge conflicts in #127.
@taylorreiter I definitely won't argue about the length. However, I'd be much happier if, after removing the entire section, we insert a sentence and a direct link to the shell genomics section, so that the learners can easily connect there for reference. Sounds reasonable?
That sounds like a great plan! I will put in a PR to address this by the end of the week. Because #127 has merge conflicts given that it was suggested prior to the lesson update, I will put in a new PR that is in the same spirit as #127 if that's ok with you @fpsom!
At the moment it is present in both Shell and Wrangling. However, the example in Shell (with a lot of Ns) makes it hard to explain the Phred scores.
AZ bbq: We agree that this quality score section should only be shown once.
One of our learners spotted this during our workshop too, and it caused confusion about why the scale shown ends at "I" while the example read immediately below contains "J".