tskit
Number of variable sites (not just num_sites)
Just a quick thought: would it be helpful to cache the number of variable sites in a tree sequence, as well as the total number of sites? I am starting to encounter cases where sites are defined but have no associated mutations. Something like ts.num_variable_sites would seem sensible. I guess it might get hairy when a site has mutations but no variation among the samples, however (e.g. if the mutations are reverted, or do not change the state).
It's not trivial, for the reasons you outline. Counting sites with zero mutations is easy to do, though, and we already do that somewhere else.
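To make the two definitions concrete, here is a minimal sketch in plain Python. It is not tskit code: the helper names `count_sites_with_mutations` and `count_variable_sites` are hypothetical, and the toy data is invented for illustration. The first helper counts sites that carry at least one mutation (the easy case); the second counts sites where the samples actually show more than one allele (the case that reversions and state-preserving mutations complicate).

```python
def count_sites_with_mutations(mutation_site_ids):
    """Count distinct site IDs that carry at least one mutation."""
    return len({int(s) for s in mutation_site_ids})


def count_variable_sites(genotype_rows, missing=-1):
    """Count sites whose samples show more than one non-missing allele.

    Each row holds the allele index of every sample at one site;
    `missing` marks missing data and is ignored.
    """
    count = 0
    for row in genotype_rows:
        observed = {int(g) for g in row if g != missing}
        if len(observed) > 1:
            count += 1
    return count


# Toy data (invented): 4 sites, 3 samples.
# One entry per mutation, giving the ID of the site it sits on.
mutation_site_ids = [1, 2, 2, 3]
genotype_rows = [
    [0, 0, 0],  # site 0: no mutations at all
    [0, 1, 0],  # site 1: one mutation, samples differ
    [0, 0, 0],  # site 2: mutation plus reversion, samples identical
    [0, 0, 0],  # site 3: mutation that does not change the state
]

print(count_sites_with_mutations(mutation_site_ids))  # 3
print(count_variable_sites(genotype_rows))  # 1
```

With a real tree sequence, the inputs could presumably come from `ts.tables.mutations.site` and from the `genotypes` array of each variant yielded by `ts.variants()`. The second count requires decoding genotypes at every site, which is part of why caching it is less straightforward than the first.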
Closing for inactivity and labelling "future", please re-open if you plan to work on this.