
Allow windows not bracketed by 0, L

Open jeromekelleher opened this issue 6 years ago • 5 comments

Currently windows in the stats calculations insist on starting with 0 and ending with L. Relax this to allow windows within the sequence.

jeromekelleher, Jun 11 '19 08:06

I've just found that I would like this behaviour for the workbooks.

Obviously, as a hack we could simply add the start and end positions, then throw away the flanking regions. It's a stupid idea, because it doesn't save on calculation time, but it would serve as a reasonable way of running the tests.
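A minimal sketch of that hack, assuming the statistic can be wrapped in a callable that takes a list of window breakpoints and returns one value per window. The helper name `stat_in_interior_windows` and its signature are made up for illustration and are not part of tskit:

```python
def stat_in_interior_windows(stat_func, windows, sequence_length):
    """Evaluate a windowed statistic on breakpoints that need not span [0, L].

    stat_func: hypothetical callable taking a list of breakpoints and
        returning a sequence with one value per window.
    windows: strictly increasing breakpoints within [0, sequence_length].
    """
    windows = [float(w) for w in windows]
    if len(windows) < 2 or any(a >= b for a, b in zip(windows, windows[1:])):
        raise ValueError("windows must be at least two strictly increasing breakpoints")
    if windows[0] < 0 or windows[-1] > sequence_length:
        raise ValueError("windows must lie within [0, sequence_length]")

    # Add the flanking breakpoints only where they are missing.
    pad_left = windows[0] > 0
    pad_right = windows[-1] < sequence_length
    padded = list(windows)
    if pad_left:
        padded.insert(0, 0.0)
    if pad_right:
        padded.append(float(sequence_length))

    # The full calculation still runs over the flanks; we only discard the results.
    result = stat_func(padded)
    start = 1 if pad_left else 0
    stop = len(result) - (1 if pad_right else 0)
    return result[start:stop]
```

With a tree sequence `ts`, this could be called as something like `stat_in_interior_windows(lambda w: ts.diversity(windows=w), [100, 200, 350], ts.sequence_length)`. As noted, nothing is saved computationally, so it is mainly useful as a reference to test a real implementation against.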

hyanwong, Jun 14 '22 20:06

This should not be a hard thing to do, but will take a bit of time.

petrelharp, Jun 15 '22 00:06

How easy do you think it will be @petrelharp - is it something I can set a project student or @savitakartik on (I think she's looking at focussed stats along the genome)?

hyanwong, Jun 15 '22 06:06

It's not totally straightforward as it would require top-to-bottom changes in the stats API. IIRC we use a custom tree iterator for the stats API so it's not just a case of calling tree.seek(start). You'd need to update the code for each of the branch, site and node stats (like this function) to skip ahead and break off at the appropriate points, and the code is quite subtle so it would need to be done carefully.
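To illustrate only the "skip ahead and break off" part, here is a rough sketch over plain `(left, right)` tree intervals. It is not the stats code itself: `clipped_spans` is a hypothetical helper, and it ignores any state the real iterator has to carry from tree to tree, which is where the subtlety lies.

```python
def clipped_spans(tree_intervals, start, end):
    """Yield (index, left, right) for the part of each tree inside [start, end).

    tree_intervals: (left, right) pairs sorted along the genome, standing in
        for the intervals a tree iterator would visit.
    """
    for index, (left, right) in enumerate(tree_intervals):
        if right <= start:
            continue  # skip ahead: this tree lies entirely before the region
        if left >= end:
            break  # break off: this tree and all later ones lie past the region
        # Clip the tree's span to the requested region.
        yield index, max(left, start), min(right, end)
```

For example, with intervals `[(0, 10), (10, 25), (25, 40), (40, 50)]` and the region `[12, 30)`, only the second and third trees contribute, with spans clipped to `(12, 25)` and `(25, 30)`.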

jeromekelleher, Jun 15 '22 08:06

I didn't realise about the custom iterator. That does make it much harder, thanks for the info.

hyanwong, Jun 15 '22 08:06

Closing in favour of #2782

jeromekelleher, Jul 07 '23 11:07