nextclade
Inconsistency in how Ns are handled for genes vs nucleotide view
When I upload a partial genome, it is not evident in gene view that there are missing parts. It looks like the sequence is equal to reference in parts that are cut out.
The same is true for the column Ns or missing. It doesn't count missing parts of the genome.
In nucleotide view, however, unsequenced parts are marked up and distinguishable (grey, with tooltip).
It may be good to surface missing beginnings and ends in gene view and in the Ns column tooltip. I think we already have the information: it should be available through alignment start/end. It could be displayed in the same tooltip but under a separate heading (as missing start/end), and maybe counted in parentheses (like we display known frame shifts).
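To make the suggestion concrete, here is a minimal sketch of how the missing start/end could be derived from alignment start/end and shown in parentheses next to the existing count. It is not Nextclade's implementation: the function names, the 0-based half-open coordinate convention, and the label format are all assumptions made for illustration.

```python
def count_missing(ref_length, alignment_start, alignment_end, internal_ns=0):
    """Split missing nucleotides into internal Ns and unsequenced start/end.

    Assumption (hypothetical convention): alignment_start is the 0-based
    reference position of the first aligned nucleotide, and alignment_end
    is one past the last aligned nucleotide.
    """
    missing_start = alignment_start              # positions [0, alignment_start)
    missing_end = ref_length - alignment_end     # positions [alignment_end, ref_length)
    return {
        "internal": internal_ns,
        "start": missing_start,
        "end": missing_end,
        "total": internal_ns + missing_start + missing_end,
    }


def format_missing(counts):
    """Hypothetical label: internal Ns, with start/end missing in parentheses."""
    outside = counts["start"] + counts["end"]
    if outside == 0:
        return str(counts["internal"])
    return f'{counts["internal"]} (+{outside})'


if __name__ == "__main__":
    counts = count_missing(29903, 100, 29000, internal_ns=5)
    print(counts)
    print(format_missing(counts))
```

Keeping the start/end count separate from the internal Ns leaves the existing column value unchanged for complete genomes, while partial genomes get an extra figure in parentheses.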
To reproduce, you can use this sequence short.txt
Or this one, maybe better since it contains some S mutations short.txt
> missing beginnings and ends
They are mentioned in the alignment start and end. However, it is unclear how to map that to amino acids.
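One possible mapping, sketched under simplifying assumptions: treat a codon as missing unless all three of its nucleotides fall inside the aligned range. The function below is hypothetical and ignores insertions, deletions, frame shifts, and reverse-strand genes; coordinates are assumed to be 0-based and half-open in reference space.

```python
def missing_codon_ranges(gene_start, gene_end, alignment_start, alignment_end):
    """Return half-open codon ranges of a gene not fully covered by the alignment.

    Assumptions (for illustration only): forward-strand gene whose length is
    a multiple of 3, no indels, 0-based half-open reference coordinates.
    """
    n_codons = (gene_end - gene_start) // 3
    if n_codons <= 0:
        return []

    # Clamp the aligned interval to the gene.
    cov_start = max(alignment_start, gene_start)
    cov_end = min(alignment_end, gene_end)
    if cov_start >= cov_end:
        return [(0, n_codons)]  # gene lies entirely outside the alignment

    # First codon whose three nucleotides are all aligned (ceiling division).
    first_full = -(-(cov_start - gene_start) // 3)
    # One past the last fully aligned codon (floor division).
    end_full = min((cov_end - gene_start) // 3, n_codons)
    if first_full >= end_full:
        return [(0, n_codons)]  # no codon is fully covered

    ranges = []
    if first_full > 0:
        ranges.append((0, first_full))
    if end_full < n_codons:
        ranges.append((end_full, n_codons))
    return ranges


if __name__ == "__main__":
    # Hypothetical gene spanning reference positions [100, 400), i.e. 100 codons.
    print(missing_codon_ranges(100, 400, 0, 250))
    print(missing_codon_ranges(100, 400, 104, 1000))
```

Counting a partially covered codon as missing is a design choice: an amino acid cannot be called from an incomplete codon, so showing it as present would repeat the original problem.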
Spotted someone raising this issue independently in the wild. Would be great if we could display alignmentStart/End on the gene view https://github.com/cov-lineages/pango-designation/issues/843#issuecomment-1196584005