usher
Suggestion to update removal rules to adapt to the current mass recombinant situation
UShER removes sequences with >5 reversions or >20 private mutations to prevent artefacts.
However, many recombinants between DV, EG, JD, GK, FL, FU, ... have been appearing recently, and they are often removed for having too many reversions or private mutations. Such removals are now quite frequent.
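A minimal sketch of the rule as described above. The function and constant names are hypothetical and not taken from UShER's code, and the strict comparison is only assumed from the ">" wording:

```python
MAX_REVERSIONS = 5          # threshold quoted in this issue
MAX_PRIVATE_MUTATIONS = 20  # threshold quoted in this issue


def exceeds_removal_thresholds(n_reversions: int, n_private: int) -> bool:
    """Return True if a sequence would be removed under the rule as described.

    Hypothetical helper for illustration only, not the actual implementation.
    """
    return n_reversions > MAX_REVERSIONS or n_private > MAX_PRIVATE_MUTATIONS
```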
For example:
https://github.com/sars-cov-2-variants/lineage-proposals/issues/846
https://github.com/sars-cov-2-variants/lineage-proposals/issues/839
https://github.com/sars-cov-2-variants/lineage-proposals/issues/811
https://github.com/sars-cov-2-variants/lineage-proposals/issues/674
https://github.com/sars-cov-2-variants/lineage-proposals/issues/879
https://github.com/sars-cov-2-variants/lineage-proposals/issues/888
These recombinants keep becoming more numerous, and it is getting harder and harder to find and list every removed recombinant and add it back to the tree.
One suggestion: before new sequences are removed, check them against potential recombinants, and do not remove a sequence that is close to one.
Alternatively, allow stationary points to be added manually that prevent removal of sequences close to them, so that at least the sequences of each recombinant are no longer removed after its initial detection.
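To make the second suggestion concrete, here is a rough sketch of how such an exemption could work. Everything in it is an assumption for illustration: sequences are represented as sets of mutation strings, "close" is taken to mean a small symmetric difference to a manually registered point, and the names `nearest_anchor`, `should_remove`, and `max_distance` are hypothetical and do not exist in UShER.

```python
from typing import Dict, FrozenSet, Optional

Mutations = FrozenSet[str]


def nearest_anchor(seq_muts: Mutations,
                   anchors: Dict[str, Mutations],
                   max_distance: int) -> Optional[str]:
    """Return the name of the closest anchor within max_distance, else None.

    Distance is the number of mutations present in exactly one of the two
    sets (symmetric difference). This metric is an assumption of the sketch.
    """
    best_name, best_dist = None, max_distance + 1
    for name, anchor_muts in anchors.items():
        dist = len(seq_muts ^ anchor_muts)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


def should_remove(flagged_by_filter: bool,
                  seq_muts: Mutations,
                  anchors: Dict[str, Mutations],
                  max_distance: int = 3) -> bool:
    """Remove a flagged sequence only if it is not close to any anchor."""
    if not flagged_by_filter:
        return False
    return nearest_anchor(seq_muts, anchors, max_distance) is None


# Placeholder data: the anchor name and mutations are made up.
anchors = {"recombinant_A": frozenset({"C100T", "G200A", "A300G"})}
near = frozenset({"C100T", "G200A", "A300G", "T400C"})
far = frozenset({"T1C", "T2C", "T3C", "T4C"})

kept = not should_remove(True, near, anchors)    # close to the anchor
dropped = should_remove(True, far, anchors)      # far from every anchor
```

The first suggestion would fit the same shape, with the anchors produced by an automatic recombinant check instead of being registered by hand.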