msprime icon indicating copy to clipboard operation
msprime copied to clipboard

Add RateMap.zip(other1, other2,...) method

Open jeromekelleher opened this issue 3 years ago • 6 comments

It would be useful to have a method that would combine two or more RateMaps and return an iterator over the distinct intervals at the rates. So, in usage, it would look like

for left, right, rate1, rate2, rate3 in rm1.zip(rm2, rm3):
    # Do stuff with the three different rates happening on the interval `left, right`.

Should be straightforward enough to implement.

jeromekelleher avatar May 11 '21 17:05 jeromekelleher

cc @mufernando @grahamgower

jeromekelleher avatar May 11 '21 17:05 jeromekelleher

I would volunteer to implement this, except I can't think of a clever way to do this in numpy. I've run across this exact problem many times and always end up doing a dumb/slow for loop. If anyone has any ideas on how to use numpy vectorized tricks to do that let me know and I can implement it.

mufernando avatar May 11 '21 17:05 mufernando

It would be nice to be able to define a RateMap with left and right intervals so that it automatically converts to the breakpoints representation. Or is this sth we want to do within stdpopsim?

mufernando avatar May 11 '21 18:05 mufernando

Since it's returning an iterator I don't think there's any point in trying to be clever, just a simple loop is the right way to go. From the stdpopsim perspective this is fine, because it wants to iterate over the intervals and do Python stuff to them.

jeromekelleher avatar May 11 '21 18:05 jeromekelleher

It would be nice to be able to define a RateMap with left and right intervals so that it automatically converts to the breakpoints representation. Or is this sth we want to do within stdpopsim?]

This is a separate issue - can you open one to track with an example of what you'd like to see?

jeromekelleher avatar May 11 '21 18:05 jeromekelleher

Since it's returning an iterator I don't think there's any point in trying to be clever

I agree with this, but if you did want to be clever with numpy, here's how to pre-compute the relevant things:

rm1 = msprime.RateMap(position=[0, 10], rate=[1.0])
rm2 = msprime.RateMap(position=[0, 3, 10], rate=[4.0, 5])
rm_list = [rm1, rm2]
pos = np.unique(np.concatenate([rm.position for rm in rm_list]))
rates = np.array([rm.rate[np.searchsorted(rm.position[:-1], pos, side='right') - 1] for rm in rm_list])
for k in range(len(pos)-1):
  print(f"segment {k}: {pos[k]} to {pos[k+1]} - rates {rates[:,k]}")

petrelharp avatar May 11 '21 18:05 petrelharp