
Number of variable sites (not just num_sites)

Open hyanwong opened this issue 1 year ago • 1 comments

Just a quick thought: would it be helpful to cache the number of variable sites in a tree sequence, as well as the total number of sites? I am starting to encounter cases where sites are defined but have no associated mutations. Something like ts.num_variable_sites would seem sensible. I guess it might get hairy when there are mutations but no variation, however (e.g. if the mutations are reverted, or do not change the state).

hyanwong avatar Feb 07 '24 14:02 hyanwong
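
To make the distinction concrete, here is a minimal pure-Python sketch on toy data (it is not tskit code). It assumes that a "variable" site means one where the samples carry more than one distinct allele, which is only one possible definition. The helper `is_variable` and the toy `sites` list are hypothetical names for illustration; with a real tree sequence the per-sample alleles could be obtained by iterating over `ts.variants()`.

```python
def is_variable(sample_alleles, missing=None):
    """Return True if the observed (non-missing) sample alleles differ."""
    observed = {a for a in sample_alleles if a != missing}
    return len(observed) > 1


# Toy data (hypothetical): (number of mutations at the site,
# allele carried by each sample at that site).
sites = [
    (0, ["A", "A", "A", "A"]),  # site defined, but no mutations
    (1, ["A", "T", "T", "A"]),  # one mutation, real variation
    (2, ["A", "A", "A", "A"]),  # mutation later reverted: no variation
    (1, ["G", "G", "G", "G"]),  # mutation that does not change the state
]

num_sites = len(sites)
num_sites_with_mutations = sum(1 for n_muts, _ in sites if n_muts > 0)
num_variable_sites = sum(1 for _, alleles in sites if is_variable(alleles))

print(num_sites, num_sites_with_mutations, num_variable_sites)
```

The three counts differ here, which is the "hairy" case: counting mutations alone overstates the number of sites that actually vary among the samples.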

It's not trivial, for the reasons you outline. Counting sites with zero mutations is easy to do though, and we already use that somewhere else.

jeromekelleher avatar Feb 07 '24 19:02 jeromekelleher
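
As a rough sketch of why the zero-mutation case is the easy one: it only needs the site ID recorded for each mutation, not the tree topology or allelic states. The function name `count_sites_without_mutations` is hypothetical, and this is not necessarily how tskit does it internally. With a real tree sequence the inputs would plausibly be `ts.num_sites` and the `ts.tables.mutations.site` column.

```python
def count_sites_without_mutations(num_sites, mutation_site_ids):
    """Count sites that no mutation refers to.

    mutation_site_ids holds one site ID per mutation, so a site with
    several mutations appears several times and is deduplicated here.
    """
    sites_with_mutations = set(mutation_site_ids)
    return num_sites - len(sites_with_mutations)


# Toy example: 6 sites, 4 mutations falling on sites 0, 0, 2 and 5.
mutation_site_ids = [0, 0, 2, 5]
num_empty = count_sites_without_mutations(6, mutation_site_ids)
print(num_empty)
```

Note that this counts sites without mutations only; it says nothing about sites whose mutations produce no variation among the samples.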

Closing for inactivity and labelling "future", please re-open if you plan to work on this.

benjeffery avatar Jun 12 '25 22:06 benjeffery