tsinfer
Allow msprime.RateMap or equivalent to be used in pip installs
In the CLI we use msprime.RateMap.read_hapmap to read in a HapMap-format file. But as @jeromekelleher says:
We don't have binary wheels for msprime on Windows, and our install is therefore failing on pip.
This is annoying because we aren't actually using the C parts of msprime, just the Python implementation of RateMap (specifically, msprime's intervals.py file). We should reactivate "--recombination-map" support in the CLI somehow, e.g. by adding that file to the tsinfer repo, either as a copy or as a submodule.
Any thoughts on the least bad option here, @benjeffery?
If it's submodule vs copy, I'd go with copy. Another option is to try to import msprime only when it is needed, and give a helpful error message asking the user to install it if the import fails. The ideal solution is to get Windows wheels working for msprime...
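For illustration, the "import when needed" option could look roughly like the sketch below. The helper name `require_optional` and its arguments are hypothetical and not part of tsinfer; the error wording is only a suggestion.

```python
import importlib


def require_optional(module_name, feature):
    """Import an optional dependency on demand.

    `require_optional` is a hypothetical helper, not an existing tsinfer
    function. If the module cannot be imported, raise an ImportError that
    names the feature that needs it and how to install it.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{feature} requires the optional package '{module_name}'. "
            f"Install it with 'pip install {module_name}' and try again."
        ) from err
```

In the CLI handler this might be used as `msprime = require_optional("msprime", "--recombination-map")` just before calling `msprime.RateMap.read_hapmap`, so that users who never pass that option never need msprime installed.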
If getting Windows wheels working for msprime is on the cards in the mid-future, that would be the best option.
We could perhaps put a copy of the intervals.py file in this repo as a short-term measure, and then remove it when msprime Windows wheels are working, assuming that's planned for the near-to-mid-term future?
A better solution might be to move intervals.py into tskit. We will probably be dealing with these tedious issues for getting ratemaps into tskit for LS matching soon, so that might tip the scales.
Oh yes, it's a great point that we might need something like this for the HMM matching parameters in tskit. Are we happy enough with the intervals.py API to incorporate it into tskit as-is, or does it need work? I'm happy to open a PR to move it in.
FWIW @stsmall was looking to use the --recombination-map functionality in the tsinfer CLI.
If it's a clean swap-in then yes, we should probably just move the code straight up into tskit.
We can leave the file in msprime until the tskit version is released, and then bump msprime's requirements.
We can now reactivate this because intervals.py is present (undocumented) in tskit 0.5.4. Do we want to do that now, or wait until it is documented? No hurry, I guess.
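If we do reactivate it before the tskit version is documented, one cautious approach is to look the class up at runtime and fall back to msprime. The sketch below assumes the class is reachable as `RateMap` in a `tskit.intervals` module, which is inferred from this thread rather than from documented API; `find_rate_map_class` is a hypothetical helper name.

```python
import importlib

# Assumed locations, tried in order. "tskit.intervals" is inferred from the
# discussion above (undocumented in tskit 0.5.4), so treat it as an assumption.
DEFAULT_CANDIDATES = (
    ("tskit.intervals", "RateMap"),
    ("msprime", "RateMap"),
)


def find_rate_map_class(candidates=DEFAULT_CANDIDATES):
    """Return the first importable attribute from `candidates`, or None.

    Each candidate is a (module_name, attribute_name) pair. Modules that
    fail to import, or that lack the attribute, are skipped.
    """
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue
        found = getattr(module, attr, None)
        if found is not None:
            return found
    return None
```

The CLI could then call `find_rate_map_class()` and, if it returns `None`, report that "--recombination-map" needs a newer tskit or an msprime install.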