tskit
Method to "reverse" a tree sequence
It's useful for algorithm development to be able to "flip" the coordinates of a tree sequence around so that we read the trees and sites in the opposite order.
I propose adding a method like this:
def mirror_coordinates(ts):
    """
    Returns a copy of the specified tree sequence in which all
    coordinates x are transformed into L - x.
    """
    L = ts.sequence_length
    tables = ts.dump_tables()
    left = tables.edges.left
    right = tables.edges.right
    tables.edges.left = L - right
    tables.edges.right = L - left
    tables.sites.position = L - tables.sites.position - 1
    # TODO migrations.
    tables.sort()
    return tables.tree_sequence()
I think this is correct, but I'd have to sit down and write a bunch of tests to be sure. I'm particularly fuzzy about what happens when sites are not discrete. I guess the above code has to be wrong then, because we'll get negative site positions for x > L - 1. Probably easiest to just raise an error if not discrete_genome.
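To make the discrete-genome point concrete, here is a small plain-Python check of the arithmetic (no tskit involved; `mirror_interval` and `mirror_site` are hypothetical helper names used only for illustration). It shows that, for integer positions, mapping a site x to L - x - 1 keeps it under the same mirrored half-open edge interval, while a non-integer position above L - 1 becomes negative.

```python
def mirror_interval(left, right, L):
    """Mirror the half-open interval [left, right) on a genome of length L."""
    return L - right, L - left


def mirror_site(position, L):
    """Mirror a site position using the L - x - 1 rule from the proposal."""
    return L - position - 1


L = 10
left, right = 2, 5
new_left, new_right = mirror_interval(left, right, L)

# Integer positions covered by the edge before mirroring ...
covered_before = [x for x in range(L) if left <= x < right]
# ... mapped through the site rule ...
covered_after = sorted(mirror_site(x, L) for x in covered_before)
# ... should be exactly the integer positions under the mirrored edge.
expected_after = [x for x in range(L) if new_left <= x < new_right]

# A non-integer position above L - 1 ends up below zero.
fractional = mirror_site(9.5, L)
```

Here the edge [2, 5) becomes [5, 8), and the sites at 2, 3, 4 land on 7, 6, 5, so sites and edges stay consistent. The position 9.5 maps to -0.5, which is the failure case that motivates raising an error for non-discrete genomes.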
Some questions:
- Should this be a method of TableCollection or TreeSequence (I guess we could do the usual thing and have an "in place" method that transforms the TableCollection?)
- Is the name ok? I think mirror_coordinates is better than something like "reverse" as that could mean a number of things. Maybe "reflect" or something else?
Any thoughts @benjeffery @petrelharp
cc @astheeggeggs
I agree that this would be an in-place operation on a TableCollection. If it found wider use you can always add a TreeSequence method later that does the operation on a copy.
As for naming how about flip_sequence_coordinates?
I like flip_sequence_coordinates, but wouldn't reverse_sequence_coordinates be even better? I'm not a big fan of "mirror", as it's not what people would first search for (they'd probably search for "reverse"). I also agree that an in-place method of TableCollection is good. I am tempted to keep it out of the TreeSequence namespace since it's a niche operation that we don't expect most users to do, but maybe this is not something to start trying to do.
Let's go with reverse_sequence_coordinates then, just as a TableCollection method. If someone asks for it as a TreeSequence method, we can add it.