Make tqdm(ts.variants()) work better

Open jeromekelleher opened this issue 3 years ago • 6 comments

It would be nice if this would pull out the overall length of the iter. Should be easy enough?

jeromekelleher avatar Nov 02 '22 13:11 jeromekelleher

Should be straightforward, we just have to abstract the code for the variants function into an iterator class (which returns the right len), a bit like the current TreeIterator class.
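As a rough illustration of the idea (not tskit's actual implementation), an iterator class that also reports a length could look like the sketch below. `SizedIterator` is a hypothetical name, and the generator of squares stands in for something like `ts.variants()`. tqdm calls `len()` on the iterable when no `total` is given, which is why defining `__len__` is enough to get a proper progress bar.

```python
from collections.abc import Iterator


class SizedIterator(Iterator):
    """Wrap an iterator together with a known item count (hypothetical helper)."""

    def __init__(self, iterator, length):
        self._iterator = iter(iterator)
        self._remaining = length

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration from the wrapped iterator propagates unchanged.
        item = next(self._iterator)
        self._remaining -= 1
        return item

    def __len__(self):
        # Report the number of items still to come.
        return max(self._remaining, 0)


# Stand-in for a generator such as ts.variants(); the count must be known up front.
squares = SizedIterator((i * i for i in range(5)), 5)
```

Reporting the remaining count rather than the original total is a design choice made here for illustration; either would work for a progress bar that reads the length once at the start.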

jeromekelleher avatar Nov 02 '22 14:11 jeromekelleher

Just revisiting this. The variants() method has left and right parameters, so it's a bit more complicated. An iterator would need to know not just ts.num_sites, but the number of sites between left and right. I suspect we would also want to change the trees() iterator to have left and right too (see https://github.com/tskit-dev/tskit/issues/24), so perhaps we should implement that first, change the TreeIterator to account for that, then use the same basic function to add len to the variants iterator.
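For the site count, one possible approach is a pair of binary searches over the sorted site positions. This is only a sketch: `count_sites_in_interval` is a hypothetical helper, the interval is assumed to be half-open (`left <= position < right`), and the plain NumPy array stands in for the site positions of a tree sequence (for example the `position` column of the sites table).

```python
import numpy as np


def count_sites_in_interval(positions, left, right):
    """Count sorted site positions p with left <= p < right."""
    positions = np.asarray(positions)
    # First index whose position is >= left.
    start = np.searchsorted(positions, left, side="left")
    # First index whose position is >= right, so sites at `right` are excluded.
    stop = np.searchsorted(positions, right, side="left")
    return int(stop - start)


# Example positions, assumed sorted in increasing order.
example_positions = np.array([1.0, 3.0, 3.5, 7.0, 9.0])
num_sites = count_sites_in_interval(example_positions, 3, 7)
```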

hyanwong avatar May 02 '23 10:05 hyanwong

I don't think the trees iterator is needed for this, as we use the Tree class directly.

jeromekelleher avatar May 02 '23 10:05 jeromekelleher

Sorry, I mean that the code that produces the TreeIterator wrapper could be generalised to provide a wrapper that would work (subclassed, presumably) for the Variants. But only if there was a generalised way to deal with left and right.

hyanwong avatar May 02 '23 12:05 hyanwong

Breakpoints are stored on the tree sequence object, and are accessible from Python, so I think that could be used to get a count of the trees in the interval for len.

benjeffery avatar May 03 '23 09:05 benjeffery

> Breakpoints are stored on the tree sequence object, and are accessible from Python, so I think that could be used to get a count of the trees in the interval for len

Yeah, this does it, I think:

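# Validate the requested genomic interval first.
ts._check_genomic_range(left, right)
breaks = ts.breakpoints(as_array=True)
# Index of the first breakpoint strictly greater than left.
left_index = breaks.searchsorted(left, side="right")
# Index of the first breakpoint greater than or equal to right.
right_index = breaks.searchsorted(right, side="left")
# Trees left_index - 1 through right_index - 1 overlap [left, right).
num_trees = right_index - left_index + 1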
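To sanity-check the index arithmetic without needing a tree sequence, here is a small self-contained sketch. The function names are hypothetical, and the `breaks` array is an assumed stand-in for what `ts.breakpoints(as_array=True)` returns (sorted, starting at 0 and ending at the sequence length). The brute-force version simply counts the tree intervals that overlap `[left, right)`.

```python
import numpy as np


def count_trees_in_interval(breaks, left, right):
    """Count trees overlapping [left, right) via two binary searches."""
    breaks = np.asarray(breaks)
    left_index = breaks.searchsorted(left, side="right")
    right_index = breaks.searchsorted(right, side="left")
    return int(right_index - left_index + 1)


def count_trees_brute_force(breaks, left, right):
    """Count tree intervals [lo, hi) that overlap [left, right) one by one."""
    return sum(
        1 for lo, hi in zip(breaks[:-1], breaks[1:]) if lo < right and hi > left
    )


# Three trees covering [0, 2), [2, 5) and [5, 10).
breaks = np.array([0.0, 2.0, 5.0, 10.0])
for left, right in [(0, 10), (2, 5), (1, 3), (2.5, 4), (0, 2)]:
    fast = count_trees_in_interval(breaks, left, right)
    assert fast == count_trees_brute_force(breaks, left, right)
```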

hyanwong avatar May 03 '23 10:05 hyanwong