cov-spectrum-website

ENH: Include targeted Jaccard similarity of mutations in comparing variant to baseline pages

Open corneliusroemer opened this issue 2 years ago • 0 comments

I'm increasingly using Jaccard similarity to find defining mutations for lineages. For example, if I look at a lineage with S:L452R in the US and want to know which other mutations co-occur with it, I sort the mutations by Jaccard similarity.

That's great and useful. However, it would be even better if Jaccard similarity weren't calculated with respect to ALL sequences, but only with respect to a (configurable) baseline variant.

It shouldn't be too difficult (I hope) to add the mutations and their Jaccard similarity to the "compare variant to baseline" page.

That's the perfect place for a targeted Jaccard similarity.

The advantage would be that mutations occurring in BA.1 wouldn't reduce Jaccard similarity when I'm studying BA.2 sublineages. Does this make sense? Happy to explain in more detail if anything is unclear.
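To make the idea concrete, here is a minimal Python sketch of what I mean. It is not how cov-spectrum computes anything today: the function name `jaccard_by_mutation`, the toy sequence IDs, and the placeholder mutations `X` and `Y` are all made up for illustration. For each mutation, the score is the Jaccard similarity between the set of sequences carrying it and the set of variant sequences, with both sets optionally restricted to a baseline.

```python
from typing import Dict, Iterable, Optional, Set


def jaccard_by_mutation(
    sequences: Dict[str, Set[str]],
    variant_ids: Iterable[str],
    baseline_ids: Optional[Iterable[str]] = None,
) -> Dict[str, float]:
    """Jaccard similarity between each mutation's carriers and the variant.

    sequences:    sequence id -> set of mutations in that sequence
    variant_ids:  ids of the sequences belonging to the variant of interest
    baseline_ids: ids to restrict the calculation to (None = all sequences)
    """
    universe = set(sequences)
    if baseline_ids is not None:
        universe &= set(baseline_ids)
    variant = set(variant_ids) & universe

    # Collect, per mutation, the sequences inside the universe that carry it.
    carriers: Dict[str, Set[str]] = {}
    for seq_id in universe:
        for mutation in sequences[seq_id]:
            carriers.setdefault(mutation, set()).add(seq_id)

    # Jaccard = |carriers AND variant| / |carriers OR variant|
    return {
        mutation: len(ids & variant) / len(ids | variant)
        for mutation, ids in carriers.items()
    }


# Hypothetical toy data: a1/a2 stand in for BA.1, b1..b4 for BA.2.
# "X" and "Y" are placeholder mutations, not real ones.
sequences = {
    "a1": {"X"},
    "a2": {"X"},
    "b1": {"Y"},
    "b2": {"Y"},
    "b3": {"Y", "X", "S:L452R"},
    "b4": {"Y", "X", "S:L452R"},
}
variant = {"b3", "b4"}          # the sublineage carrying S:L452R
ba2 = {"b1", "b2", "b3", "b4"}  # the baseline

global_scores = jaccard_by_mutation(sequences, variant)
targeted_scores = jaccard_by_mutation(sequences, variant, baseline_ids=ba2)

print(global_scores["X"], targeted_scores["X"])
```

In this toy data, `X` occurs in every variant sequence but also in the BA.1-like sequences, so its score against all sequences is pulled down. With the baseline restricted to the BA.2-like sequences, `X` separates the sublineage from the rest of the baseline perfectly, which is exactly the signal I'm after.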


corneliusroemer, Apr 07 '22 21:04