nextclade
ENH: Report reverted deletions (as ranges) and use in QC
Right now, we only categorize private substitutions into a) reversions, b) labeled, c) unlabeled.
Since true reversions of deletions are very unlikely (unless there's contamination/coinfection/recombination), apparent reversions contain valuable QC/recombinant-detection information.
It would be good to have that information. It is important, though, that ranges of deleted nucleotides are counted rather than each deleted nucleotide individually: a deletion is usually a single event and carries the same information whether it is 3 bp or 30 bp long.
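A minimal sketch of the range-based counting described above, not Nextclade's actual implementation. The function names, the half-open 0-based range convention, and the `query_unknown` parameter (positions that are missing or ambiguous in the query and so should not be read as reverted) are all assumptions made for illustration.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical convention: half-open, 0-based range [begin, end).
Range = Tuple[int, int]


def to_ranges(positions: Iterable[int]) -> List[Range]:
    """Collapse individual positions into sorted contiguous ranges."""
    ranges: List[Range] = []
    for pos in sorted(set(positions)):
        if ranges and pos == ranges[-1][1]:
            # Position directly extends the previous range.
            ranges[-1] = (ranges[-1][0], pos + 1)
        else:
            ranges.append((pos, pos + 1))
    return ranges


def reverted_deletion_ranges(
    node_deleted: Iterable[int],
    query_deleted: Iterable[int],
    query_unknown: Iterable[int] = (),
) -> List[Range]:
    """Ranges deleted at the nearest tree node but present in the query.

    Positions listed in `query_unknown` are skipped, since a missing base
    is not evidence of a reversion.
    """
    reverted: Set[int] = (
        set(node_deleted) - set(query_deleted) - set(query_unknown)
    )
    return to_ranges(reverted)


# Illustrative positions only: a 3 bp and a 30 bp deletion at the node.
node = set(range(100, 103)) | set(range(200, 230))
events = reverted_deletion_ranges(node, query_deleted=set())
print(len(events), events)
```

With this counting, a query that has lost both deletions contributes two events to QC, not 33, which matches the idea that each deletion is a single event regardless of its length.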
This is marked prio: low for now.
Also useful, but trickier, would be to annotate reverted insertions. Right now, insertions aren't included in the reference tree. To find reverted insertions, we would need to infer insertions at all internal nodes (this is already done in ncov-simple).
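Once insertions are inferred on the tree, the comparison itself could be a simple set difference. The sketch below assumes a hypothetical representation of an insertion as a `(position, inserted_sequence)` pair and assumes the node's insertions are already available; it does not cover the inference step, which is the hard part.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: (reference position, inserted sequence).
Insertion = Tuple[int, str]


def reverted_insertions(
    node_insertions: Iterable[Insertion],
    query_insertions: Iterable[Insertion],
) -> List[Insertion]:
    """Insertions present at the nearest tree node but absent in the query."""
    query = set(query_insertions)
    return sorted(ins for ins in set(node_insertions) if ins not in query)


# Illustrative values only.
node = [(500, "GAGCCA"), (100, "A")]
query = [(100, "A")]
print(reverted_insertions(node, query))
```

Each insertion is compared as a whole unit, so a multi-base insertion counts as one event, in line with how deletion ranges are treated.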