cov-spectrum-website
ENH: Include targeted Jaccard similarity of mutations in comparing variant to baseline pages
I'm increasingly using Jaccard similarity to find defining mutations for lineages. For example, if I look at a lineage with S:L452R in the US and want to know which other mutations co-occur, I sort by Jaccard similarity.
That's great and useful. However, it would be even better if Jaccard similarity weren't calculated with respect to ALL sequences but only with respect to a (configurable) baseline variant.
It looks like it wouldn't be difficult (I hope) to add mutations and Jaccard similarity to the "compare variant to baseline" page:
That's the perfect place for a targeted Jaccard similarity.
The advantage would be that mutations occurring in BA.1 wouldn't reduce Jaccard similarity when I'm studying BA.2 sublineages. Does this make sense? Happy to explain in more detail if anything is unclear.
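To make the request concrete, here is a minimal sketch of what I mean, in Python rather than the site's own code. Everything in it is an assumption for illustration: the function names, the `baseline_ids` parameter, the toy sequence IDs, and the placeholder mutations `mutX` and `mutY` are made up and do not reflect how cov-spectrum stores or queries data. Only S:L452R is taken from the example above.

```python
from typing import Dict, Optional, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sequences_with(mutation: str, sequences: Dict[str, Set[str]]) -> Set[str]:
    """IDs of all sequences that carry the given mutation."""
    return {seq_id for seq_id, muts in sequences.items() if mutation in muts}


def targeted_jaccard(
    target: str,
    other: str,
    sequences: Dict[str, Set[str]],
    baseline_ids: Optional[Set[str]] = None,
) -> float:
    """Jaccard similarity between the sequences carrying `target` and those
    carrying `other`. If `baseline_ids` is given, only sequences belonging
    to that (hypothetical) baseline variant are considered."""
    if baseline_ids is not None:
        sequences = {k: v for k, v in sequences.items() if k in baseline_ids}
    return jaccard(sequences_with(target, sequences),
                   sequences_with(other, sequences))


# Toy data (invented): s1-s3 stand in for the BA.2 baseline, s4-s6 for BA.1.
sequences = {
    "s1": {"S:L452R", "mutX"},
    "s2": {"S:L452R", "mutX"},
    "s3": {"mutY"},
    "s4": {"mutX"},
    "s5": {"mutX"},
    "s6": {"mutX"},
}
ba2_baseline = {"s1", "s2", "s3"}

global_score = targeted_jaccard("S:L452R", "mutX", sequences)
baseline_score = targeted_jaccard("S:L452R", "mutX", sequences, ba2_baseline)
print(global_score, baseline_score)
```

In this toy data, `mutX` also occurs in the three BA.1 stand-ins, so the score over all sequences is 2/5, while within the baseline it is 2/2. That is the effect I'm after: mutations that are common outside the baseline no longer drag the similarity down.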