Report aa mutations within a frame shift

Open jpflorido opened this issue 2 years ago • 2 comments

Hi there,

Thanks for considering the detection of frame shifts in the new release! I was wondering whether there is a way to translate the whole gene when there is a frame shift (not only the positions before the frame shift) through the 'ignoreFrameShift' option in the qc.json file.

In some cases, low-quality regions can produce "false frame shifts", such as the one described in this ticket: https://github.com/nextstrain/nextclade/issues/469 There, the alignment contains N-- where it should probably be ---. I would like to know whether users can handle that with this new feature.

Thanks,
Javier

jpflorido, Oct 05 '21 12:10

Hi @jpflorido

We generally treat N as a base that is present in the sequence but for which it is unknown whether it is A, C, G, or T (U). Optionally treating Ns as gaps as well would be pretty difficult to implement in the alignment algorithm. Illumina reads should have very little indel error, so we expect that an N corresponds to a base, not a gap.

We don't translate the sequence past the frame shift because the resulting amino acid sequence would be pretty much unalignable. This is difficult to handle in a reference alignment framework.

We have been discussing falling back on a codon-by-codon translation of the nucleotide alignment in these cases, but this isn't implemented yet. It would be a trade-off between biological accuracy and robustness to sequencing noise.
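To illustrate the idea, here is a minimal sketch of a codon-by-codon translation of an aligned query. It is not Nextclade's code: the function name `translate_codon_by_codon` is hypothetical, and it assumes the query is already aligned to the reference with insertions stripped, so that every column corresponds to a reference position and the gene starts at column 0. Codons that contain an N or a partial gap are reported as X, and fully deleted codons as a gap.

```python
# Standard genetic code, built from the canonical TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate_codon_by_codon(aligned_query: str) -> str:
    """Translate an aligned query in the reference codon frame.

    Hypothetical helper: assumes one column per reference position
    (insertions removed) and that the gene starts at column 0.
    """
    peptide = []
    usable = len(aligned_query) - len(aligned_query) % 3  # drop a trailing partial codon
    for start in range(0, usable, 3):
        codon = aligned_query[start:start + 3].upper().replace("U", "T")
        if codon == "---":
            peptide.append("-")  # whole codon deleted
        elif codon in CODON_TABLE:
            peptide.append(CODON_TABLE[codon])
        else:
            peptide.append("X")  # contains N or a partial gap: unknown residue
    return "".join(peptide)


# Reference ATG GCT AAA GGT translates to MAKG.
print(translate_codon_by_codon("ATGN--AAAGGT"))  # noisy "N--" codon
print(translate_codon_by_codon("ATG---AAAGGT"))  # clean codon deletion
```

In this sketch the noisy `N--` codon costs only one unknown residue and the downstream codons are still translated. The same happens for a real single-base deletion such as `ATGG-TAAAGGT`, where the true protein would change downstream, which is the trade-off mentioned above.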

richard

rneher, Oct 12 '21 12:10

I just wanted to add to Richard's answer that there is no way around the current behavior with any configuration option. Frame shift detection and translation happen before QC, and the option to ignore frame shifts is purely cosmetic: it only excludes the detected frame shifts from counting towards the QC score and from being shown in the UI and output files.

ivan-aksamentov, Oct 12 '21 12:10