Friendly request for explanation of amino acid visualization in Nextstrain

Open kagningemma opened this issue 2 years ago • 2 comments

I uploaded a SARS-CoV-2 sequence to Nextclade. The result shows two deletions, but I do not understand how they are visualized. The Nextclade output is shown below.

picture_nextrain (2)

In the image above, Nextclade identified the deletions S:L242- and S:A243-.
However, the CTT codon corresponding to amino acid L242 does not appear deleted in the Query nucleotide sequence, although it is shown as deleted in the QueryAA sequence.
In the same image, the codons GCT and TTA (22289-22294), which are present in the reference, are missing from the Query nucleotide sequence. I would therefore expect Nextclade to report deletions at amino acid residues 243 and 244 rather than 242 and 243. Can anyone kindly clarify this? Thank you,

kagningemma avatar Oct 08 '21 04:10 kagningemma

Hi @kagningemma

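Yes, this is an oddity of the way we currently associate events in the nucleotide and amino acid alignments.

Codon 244 is TTA, which also translates to L, just like codon 242 (CTT). Removing codons 242 and 243 (L, A) or codons 243 and 244 (A, L) therefore leaves a single L either way, so the two placements are completely equivalent at the amino acid level. The aligner still has to pick one placement, though, and the nucleotide and amino acid alignments are not guaranteed to pick the same one, which is why the deletion appears at different positions in the two tracks. We are discussing ways to improve this. Best, Richard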
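The equivalence can be checked with a small sketch. It is not Nextclade's implementation, only an illustration using the three codons quoted in this thread. The function names are hypothetical, and the spike start coordinate (21563 in the Wuhan-Hu-1 reference) is an assumption that is not stated in the thread.

```python
# Illustrative sketch only, not Nextclade code.
# Assumption: the S gene starts at nucleotide 21563 (1-based) in Wuhan-Hu-1.
SPIKE_START = 21563

# Minimal codon table: only the three codons mentioned in this thread.
CODON_TABLE = {"CTT": "L", "GCT": "A", "TTA": "L"}

# Reference codons for spike positions 242-244, as quoted in the issue.
REFERENCE = {242: "CTT", 243: "GCT", 244: "TTA"}


def codon_number(nuc_pos):
    """Map a 1-based genome coordinate to a 1-based spike codon number."""
    return (nuc_pos - SPIKE_START) // 3 + 1


def translate_after_deletion(deleted_codons):
    """Translate the codons that remain after deleting whole codons."""
    remaining = [
        codon
        for pos, codon in sorted(REFERENCE.items())
        if pos not in deleted_codons
    ]
    return "".join(CODON_TABLE[codon] for codon in remaining)


# The nucleotide gap 22289-22294 spans codons 243 and 244.
gap_codons = (codon_number(22289), codon_number(22294))

# Placement seen in the nucleotide track vs. the amino acid track.
nuc_view = translate_after_deletion({243, 244})
aa_view = translate_after_deletion({242, 243})

print(gap_codons, nuc_view, aa_view, nuc_view == aa_view)
```

Both placements leave the same residue behind, so the resulting protein sequence is identical; only the reported position of the gap differs.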

rneher avatar Oct 14 '21 07:10 rneher

Possibly related to #608

corneliusroemer avatar Nov 24 '21 01:11 corneliusroemer