
Lineages with potentially large deletions masquerading as stretches of Ns

Open corneliusroemer opened this issue 6 months ago • 13 comments

Inspired by @cassiawag's work on large deletions in ORF7/8, which often show up in assemblies as stretches of Ns rather than as deletions, I had a look through currently submitted sequences to see if I could find more examples.

@ryhisner has already pointed out that GW.5.1.1 seems to have a large ORF8 deletion.

I thought it might be useful to add further examples here. A simple tool could be written to autodetect such suspicious stretches of Ns and distinguish between amplicon dropout and real deletions, e.g. by checking that the same stretch of Ns is observed in almost all sequences from a lineage, and in sequences from a number of different labs/countries.

I found:

  • GW.5.1.1: 27395-28246 - already known - but suspiciously identical with GE.1.2...
  • GE.1.2: around 27395-28246, encompassing end of ORF6, ORF7a, ORF7b and into ORF8

I didn't expect them to have the same stretch - maybe these macro deletions are due to amplicon dropout in mostly English samples after all. Could be that there is a real short indel that throws primers off in that region.
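To make the autodetection idea above a bit more concrete, here is a rough sketch of the heuristic: find long internal runs of N in each sequence, then keep only runs that recur (within a few bases) in almost all sequences of a lineage and come from several labs. Everything here is an assumption for illustration: the function names, the `(lineage, lab, sequence)` input shape, and the thresholds (`min_len`, `tolerance`, `min_fraction`, `min_labs`) are made up, and sequences are assumed to be already aligned to the reference so that coordinates are comparable.

```python
import re
from collections import defaultdict

N_RUN = re.compile(r"N+")

def n_runs(seq, min_len=200):
    """1-based inclusive (start, end) of internal N runs of at least min_len bases."""
    runs = []
    for m in N_RUN.finditer(seq.upper()):
        # Skip runs touching either end: uncovered genome ends are common.
        internal = m.start() > 0 and m.end() < len(seq)
        if internal and m.end() - m.start() >= min_len:
            runs.append((m.start() + 1, m.end()))
    return runs

def suspicious_runs(records, min_len=200, tolerance=10, min_fraction=0.9, min_labs=3):
    """records: iterable of (lineage, lab, sequence), sequences aligned to the reference.

    Returns {lineage: [(start, end, fraction_of_sequences, n_labs), ...]}.
    """
    by_lineage = defaultdict(list)
    for lineage, lab, seq in records:
        by_lineage[lineage].append((lab, n_runs(seq, min_len)))

    hits = {}
    for lineage, samples in by_lineage.items():
        # Every distinct run seen in this lineage is a candidate.
        candidates = {run for _, runs in samples for run in runs}
        found = []
        for start, end in sorted(candidates):
            # Labs of the sequences that carry a run matching this candidate.
            labs = [
                lab for lab, runs in samples
                if any(abs(s - start) <= tolerance and abs(e - end) <= tolerance
                       for s, e in runs)
            ]
            fraction = len(labs) / len(samples)
            if fraction >= min_fraction and len(set(labs)) >= min_labs:
                found.append((start, end, fraction, len(set(labs))))
        if found:
            hits[lineage] = found
    return hits
```

This does not merge near-identical candidates, so a run whose boundaries wobble by a base or two can be reported more than once. It also cannot by itself rule out the case raised above, where two lineages sequenced with the same primer scheme share the same dropout; comparing the flagged coordinates against amplicon boundaries would be a natural next check.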


Edited by mod.: Lineages spotted by community (see comments below): FW.1.1, JP.1.1, XBB.1.5.71 (50%), EG.5.1.12

corneliusroemer, Dec 20 '23 16:12