Allow msprime.RateMap or equivalent to be used in pip installs

Open hyanwong opened this issue 3 years ago • 8 comments

In the CLI we use msprime.RateMap.read_hapmap to read in a HapMap-format file. But as @jeromekelleher says:

> We don't have binary wheels for msprime on Windows, and our install is therefore failing on pip (for the reasons that we don't have binary wheels on windows).

This is annoying because we aren't actually using the C parts of msprime, just the Python implementation of RateMap (specifically, the msprime intervals.py file). We should reactivate support for "--recombination-map" in the CLI somehow, e.g. by adding that file to the tsinfer repo, either as a copy or as a submodule.

hyanwong avatar Oct 25 '22 09:10 hyanwong

Any thoughts on the least bad option here @benjeffery ?

hyanwong avatar Oct 27 '22 08:10 hyanwong

If it's submodule vs copy, I'd go with copy. Another option is to try to import msprime only when it is needed, and give a helpful error message asking the user to install it if the import fails. The ideal solution is to get Windows wheels working for msprime...
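The import-when-needed option could look roughly like the sketch below. This is only an illustration, not the tsinfer implementation: `require_optional` and `read_recombination_map` are hypothetical helper names, and the only real API call used is `msprime.RateMap.read_hapmap`, as mentioned at the top of this issue.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency only at the point a feature needs it.

    Hypothetical helper: raises an ImportError that names the feature and
    tells the user how to install the missing package.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            f"Install it with: pip install {module_name}"
        ) from exc


def read_recombination_map(path):
    """Hypothetical CLI helper for the --recombination-map option."""
    # msprime is imported here rather than at module import time, so an
    # install without msprime keeps working until this option is used.
    msprime = require_optional("msprime", "--recombination-map")
    return msprime.RateMap.read_hapmap(path)
```

With this layout, users who never pass `--recombination-map` are unaffected by a missing msprime, and those who do get an actionable message rather than a bare traceback.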

benjeffery avatar Oct 27 '22 09:10 benjeffery

> The ideal solution is to get Windows wheels working for msprime...

If this is on the cards in the medium term, that would be the best option.

hyanwong avatar Oct 27 '22 17:10 hyanwong

We could perhaps put a copy of the intervals.py file in this repo as a short-term measure, and then remove it when msprime Windows wheels are working, assuming that's planned for the near-to-mid-term future?

hyanwong avatar Nov 09 '22 17:11 hyanwong

A better solution might be to move intervals.py into tskit. We will probably be dealing with these tedious issues for getting ratemaps into tskit for LS matching soon, so that might tip the scales.

jeromekelleher avatar Nov 09 '22 19:11 jeromekelleher

> A better solution might be to move intervals.py into tskit. We will probably be dealing with these tedious issues for getting ratemaps into tskit for LS matching soon, so that might tip the scales.

Oh yes, it's a great point that we might need something like this for the HMM matching parameters in tskit. Are we happy enough with the intervals.py API to incorporate it into tskit as-is, or does it need work? I'm happy to open a PR to move it in.

FWIW @stsmall was looking to use the --recombination-map functionality in the tsinfer CLI.

hyanwong avatar Nov 10 '22 09:11 hyanwong

If it's a clean swap-in then yes, we should probably just move the code straight up into tskit.

We can leave the file in msprime until the tskit version is released, and then bump msprime's requirements.

jeromekelleher avatar Nov 10 '22 09:11 jeromekelleher

We can now reactivate this because intervals.py is present (undocumented) in tskit 0.5.4. Do we want to do that now, or wait until it is documented? No hurry, I guess.
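For the transition period, one possible approach is to prefer the tskit copy and fall back to msprime if it is missing. The sketch below is illustrative only: `first_available` and `DEFAULT_CANDIDATES` are hypothetical names, and the assumption that the class is reachable as `tskit.intervals.RateMap` is inferred from the file name and should be checked against the installed tskit version, since the module is undocumented.

```python
import importlib

# Hypothetical preference order. The tskit location is an assumption (the
# module is undocumented); msprime.RateMap is what the CLI used originally.
DEFAULT_CANDIDATES = (
    ("tskit.intervals", "RateMap"),
    ("msprime", "RateMap"),
)


def first_available(candidates=DEFAULT_CANDIDATES):
    """Return the first attribute that can be imported from the candidates."""
    for module_name, attr_name in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # package or submodule missing, try the next candidate
        found = getattr(module, attr_name, None)
        if found is not None:
            return found
    tried = ", ".join(f"{m}.{a}" for m, a in candidates)
    raise ImportError(f"None of these could be imported: {tried}")
```

Calling `first_available()` with no arguments would then give the CLI a rate map class from whichever package is installed, and the msprime entry could be dropped once the tskit version is documented.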

hyanwong avatar Jan 20 '23 10:01 hyanwong