Add dependency on tsdate

Open jeromekelleher opened this issue 1 year ago • 3 comments

As part of the changes for tsinfer 1.0, I think we should add the ability to date the inferred trees by adding a dependency on tsdate. The motivation is that running the full tsinfer+tsdate pipeline correctly is reasonably involved, requiring multiple pre- and postprocessing steps to get the best performance. Since most people probably want to date their ARGs, we should make it easy for them to do it as well as possible.

The same applies to running multiple rounds of tsinfer based on inferred dates.

How we structure the APIs etc. is something to figure out, but I'm sure we can manage it.

jeromekelleher, Jun 03 '24 13:06

Yep, discussing just now with @jeromekelleher and this seems sensible to me.

One suggestion is that tsinfer.date(ts_in, ts_out) is another step, just like tsinfer.match_samples(sd, ts_in). I think by default this should carry out something internally like:

ts_tmp = tsdate.preprocess_ts(ts_in)
return tsdate.date(ts_tmp, mutation_rate=mu)

Then tsinfer.infer() would be "shorthand" for the following 4 steps:

  1. tsinfer.generate_ancestors
  2. tsinfer.match_ancestors
  3. tsinfer.match_samples
  4. tsinfer.date # n.b. we could call this tsinfer.set_times() if we want to distinguish it from tsdate.date?
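The four steps above can be sketched as a single function. This is only an illustration of the proposal, not an existing tsinfer API: the name `infer_and_date` is hypothetical, and the exact arguments that `tsdate.date` requires may differ between tsdate versions.

```python
def infer_and_date(sample_data, mutation_rate):
    """Hypothetical helper chaining the four proposed steps."""
    # Imported inside the function so that defining this sketch does not
    # require either package to be installed.
    import tsdate
    import tsinfer

    # Steps 1-3: the existing tsinfer pipeline.
    ancestors = tsinfer.generate_ancestors(sample_data)
    ancestors_ts = tsinfer.match_ancestors(sample_data, ancestors)
    inferred_ts = tsinfer.match_samples(sample_data, ancestors_ts)

    # Step 4: the proposed dating step (preprocess, then date).
    preprocessed_ts = tsdate.preprocess_ts(inferred_ts)
    return tsdate.date(preprocessed_ts, mutation_rate=mutation_rate)
```

Here `sample_data` is assumed to be whatever input object the installed tsinfer version accepts for these three functions.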

We need to think about the API, e.g. if we require a mutation rate (or if it is absent, whether we simply return an undated tree sequence).

hyanwong, Jun 03 '24 14:06

> We need to think about the API, e.g. if we require a mutation rate (or if it is absent, whether we simply return an undated tree sequence).

That would be a good way to maintain backward compatibility - if mutation_rate is supplied to tsinfer.infer then the additional dating step is done, otherwise the behaviour is the same as we have now.
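A minimal sketch of that backward-compatible behaviour, written as a standalone wrapper rather than a change to tsinfer itself. The wrapper name and the `infer_fn` / `date_fn` parameters are hypothetical and exist only so the control flow can be exercised without real data; by default it falls back to `tsinfer.infer`, `tsdate.preprocess_ts` and `tsdate.date`.

```python
def infer_with_optional_dating(sample_data, mutation_rate=None, *,
                               infer_fn=None, date_fn=None):
    """Hypothetical wrapper: date the result only if mutation_rate is given."""
    if infer_fn is None:
        import tsinfer
        infer_fn = tsinfer.infer

    ts = infer_fn(sample_data)

    # No mutation rate: behave exactly as the undated pipeline does today.
    if mutation_rate is None:
        return ts

    if date_fn is None:
        import tsdate

        def date_fn(ts, mutation_rate):
            # Preprocess, then date, as suggested earlier in the thread.
            return tsdate.date(tsdate.preprocess_ts(ts),
                               mutation_rate=mutation_rate)

    return date_fn(ts, mutation_rate)
```

Defaulting `mutation_rate` to `None` means existing calls that pass only the sample data keep returning an undated tree sequence.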

jeromekelleher, Jun 03 '24 14:06

Sounds good to me.

benjeffery, Jun 03 '24 16:06