Is tree.num_samples_array useful/worth it?

Open hyanwong opened this issue 2 years ago • 4 comments

We have a tree.num_children_array but not a tree.num_samples_array. Is such a thing worth making available? I suspect we need to keep track of the number of samples under each node anyway. I recently wanted to be able to quickly check all the doubleton nodes in a tree sequence, and this would have been helpful. But perhaps it's just extra burden to maintain something like this, for little gain. Either way, we can record here for posterity what the conclusion is.

hyanwong avatar Jun 23 '23 14:06 hyanwong
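
For readers wondering what such an array would contain, here is a minimal sketch of the doubleton check described above. It is not tskit code: `num_samples_per_node` and `doubleton_nodes` are hypothetical helpers, and the inputs are assumed to be a plain parent list (one entry per node, -1 for a root, in the spirit of `tree.parent_array`) plus a list of sample node IDs. Within tskit itself, `tree.num_samples(u)` already returns this count one node at a time.

```python
NULL = -1  # assumed "no parent" marker, same value as tskit.NULL


def num_samples_per_node(parent, samples):
    """Return a list giving, for each node, the number of samples at or below it.

    Hypothetical helper: walks from every sample up to its root and
    increments the count of each node visited along the way.
    """
    counts = [0] * len(parent)
    for s in samples:
        u = s
        while u != NULL:
            counts[u] += 1
            u = parent[u]
    return counts


def doubleton_nodes(parent, samples):
    """Return the IDs of nodes that subtend exactly two samples."""
    counts = num_samples_per_node(parent, samples)
    return [u for u, c in enumerate(counts) if c == 2]


# Toy tree: samples 0-3, node 4 = parent of (0, 1), node 5 = parent of (2, 3),
# node 6 = root joining 4 and 5.
parent = [4, 4, 5, 5, 6, 6, NULL]
samples = [0, 1, 2, 3]
print(num_samples_per_node(parent, samples))
print(doubleton_nodes(parent, samples))
```

Walking up from each sample costs time proportional to tree depth per sample, which is fine for a one-off check on a single tree. An array maintained incrementally as the tree sequence is traversed would avoid recomputing it for every tree.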

I think the reason I didn't implement it is because sample_counting is an optional feature of the tree class, which would require some extra layers of checking to be made safe. Otherwise it's definitely useful.

jeromekelleher avatar Jun 26 '23 13:06 jeromekelleher

I think we always count the number of samples (it's only their identity that's optional), right?

hyanwong avatar Jun 26 '23 14:06 hyanwong

No, it's optional at the C level.

jeromekelleher avatar Jun 26 '23 20:06 jeromekelleher

> No, it's optional at the C level.

Ah, I didn't realise, sorry. I see. Thanks for the clarification.

hyanwong avatar Jun 27 '23 13:06 hyanwong

Closing for inactivity.

benjeffery avatar Jun 12 '25 22:06 benjeffery