msprime

Integration of Selfing and Dormancy

Open TPPSellinger opened this issue 3 years ago • 9 comments

I have full permission to share our scripts for simulating genomes with piecewise variation in selfing and dormancy. @jeromekelleher, let me know how to proceed. Since it's all rescaling, we built Python functions that create msprime "command lines" (i.e. parameters and models). Stefan Strütt from the Max Planck Institute in Köln ran simulations comparing msprime to SLiM, demonstrating that both approaches are equivalent (the paper should be coming soon). In my lab we are building a forward simulator to do the same for dormancy.

TPPSellinger avatar Apr 16 '21 14:04 TPPSellinger

That's excellent, thanks for sharing @TPPSellinger! I guess the simplest thing would be to wait for the paper, and then we can think about how we might merge the functionality into msprime?

Or, if you'd like to get things in more quickly, perhaps someone could summarise the approach here and the proposed changes?

jeromekelleher avatar Apr 22 '21 07:04 jeromekelleher

I will look into it closely and try to get a PR ready for when the paper is submitted/accepted. I hope this is okay for you; I don't really know how you would like this to be integrated into msprime. I had in mind something like the way variation in population size is handled (the default being no selfing and no dormancy).

TPPSellinger avatar Apr 26 '21 11:04 TPPSellinger

We could help suggest how this would be implemented if you give even a brief summary of how this works? I'm imagining that selfing and dormancy act as rescalings on recombination rate, effective population size, and generation time? In particular: is it rescaling only the trees (sim_ancestry) or also mutation rate? When you say "variation" do you mean that these are changing in time, like 1000 years of 90% selfing followed by 1000 years of 70% selfing?

petrelharp avatar Apr 27 '21 16:04 petrelharp

Hi @petrelharp. My mistake for not giving more insight. It's exactly what you said: we change these rates so that they are piecewise constant in time. Selfing rescales Ne and the recombination rate, so we change the recombination rate and Ne through time to mimic the effect of selfing. For dormancy, it's slightly trickier. Depending on the hypothesis (or model), it will additionally scale the mutation rate. Here too it's piecewise constant, and at generation x we change the scaling according to the new rates.
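To make the piecewise-constant idea concrete, here is a minimal sketch of how per-epoch parameters could be computed for selfing. It is not the script mentioned in this thread. It assumes the classical coalescent-with-selfing rescaling, with inbreeding coefficient F = s / (2 - s) for selfing rate s, effective size N / (1 + F), and effective recombination rate r * (1 - F). The names `Epoch`, `inbreeding_coefficient` and `rescale_epochs` are hypothetical, and dormancy is left out because its rescaling depends on the model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epoch:
    """One piecewise-constant interval, going backwards in time."""
    start_time: float    # generations ago at which this selfing rate starts to apply
    selfing_rate: float  # fraction of offspring produced by selfing, in [0, 1]


def inbreeding_coefficient(selfing_rate):
    """Assumed relation F = s / (2 - s) between selfing rate and inbreeding."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be between 0 and 1")
    return selfing_rate / (2.0 - selfing_rate)


def rescale_epochs(epochs, population_size, recombination_rate):
    """Return (start_time, effective_size, effective_recombination_rate) per epoch.

    Assumption: selfing divides the effective size by (1 + F) and multiplies
    the recombination rate by (1 - F). Epochs are sorted from the present
    backwards.
    """
    schedule = []
    for epoch in sorted(epochs, key=lambda e: e.start_time):
        f = inbreeding_coefficient(epoch.selfing_rate)
        schedule.append(
            (
                epoch.start_time,
                population_size / (1.0 + f),
                recombination_rate * (1.0 - f),
            )
        )
    return schedule


# Example: 90% selfing for the most recent 1000 generations, 70% before that.
schedule = rescale_epochs(
    [Epoch(start_time=0, selfing_rate=0.9), Epoch(start_time=1000, selfing_rate=0.7)],
    population_size=10_000,
    recombination_rate=1e-8,
)
for start_time, effective_size, effective_rate in schedule:
    print(start_time, effective_size, effective_rate)
```

Each row of `schedule` gives the time at which a new pair of rescaled values takes over, which is the information one would then have to pass on to the simulator.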

I hope this helped a little bit. For the moment everything is piecewise constant, but I can ask Stefan Strütt (he doesn't use GitHub, but I will) if he has the formulas for smooth transitions. Let me know if you want me to put all the rescaling formulas I have here. I can also share the Python script we use (it's not very long and quite straightforward); it will probably help make things clearer.

TPPSellinger avatar Apr 27 '21 16:04 TPPSellinger

Thanks! We don't need the formulas now; I'm just trying to think about the API. It's tricky because it affects both ancestry and mutations. Here are some options, brainstorming:

  • define a new AncestryModel that has selfing_rate and mean_dormancy_time arguments (that could be RateMaps to allow for things changing in time)? But, this won't deal with mutation rates; people would have to know how to adjust those when laying down mutations afterwards (which we could make easier somehow?).

  • add those arguments to the StandardCoalescent model.

  • add those arguments (or, similar) to sim_ancestry and sim_mutations and hook them up to the underlying machinery so that they scale things appropriately

  • write a method, sim_ancestry_with_selfing, that takes a demographic model and (a) transforms it (b) simulates from it (c) transforms the resulting tree sequence back

None of these options seem very attractive or easy. Any suggestions?
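For discussion, the last option could look roughly like the sketch below for the simplest case of a single constant selfing rate. Both `rescale_for_selfing` and `sim_ancestry_with_selfing` are hypothetical names, not part of msprime, and the rescaling (F = s / (2 - s), size divided by 1 + F, recombination multiplied by 1 - F) is an assumed textbook form rather than anything confirmed in this thread. Only steps (a) and (b) are covered; transforming the tree sequence back and adjusting mutation rates are not attempted.

```python
def rescale_for_selfing(population_size, recombination_rate, selfing_rate):
    """Step (a): map outcrossing parameters to assumed selfing-equivalent ones."""
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must be between 0 and 1")
    # Assumed inbreeding coefficient for a partially selfing population.
    f = selfing_rate / (2.0 - selfing_rate)
    return {
        "population_size": population_size / (1.0 + f),
        "recombination_rate": recombination_rate * (1.0 - f),
    }


def sim_ancestry_with_selfing(
    samples, *, population_size, recombination_rate, selfing_rate, **kwargs
):
    """Step (b): run msprime.sim_ancestry on the rescaled parameters.

    Hypothetical wrapper. Extra keyword arguments (for example
    sequence_length or random_seed) are passed through unchanged.
    """
    # Imported here so the pure rescaling helper can be used without msprime.
    import msprime

    scaled = rescale_for_selfing(population_size, recombination_rate, selfing_rate)
    return msprime.sim_ancestry(samples, **scaled, **kwargs)
```

A call might then look like `sim_ancestry_with_selfing(10, population_size=10_000, recombination_rate=1e-8, selfing_rate=0.9, sequence_length=1e6, random_seed=1)`. Keeping the transformation in its own function means the same helper could later be reused from `sim_mutations`-side code if a model needs the mutation rate adjusted too.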

petrelharp avatar Apr 27 '21 17:04 petrelharp

Hey! Just dropping a line because I am very interested in using this functionality! Is there any update or anything I can help with?

David-Peede avatar Aug 16 '23 21:08 David-Peede

I happen to have communicated with @TPPSellinger recently on another issue; he's moved to industry so I think there's no update. So - it's open and available to help with, if you're interested!

petrelharp avatar Aug 17 '23 05:08 petrelharp

Hi! Yes, unfortunately I have moved to industry, but I am trying to finish everything that I started during my postdocs. We recently published a paper using msprime to simulate selfing (https://elifesciences.org/articles/82384). The scripts should be available somewhere in the paper. They should also be on my GitHub. I hope this can help for the moment and make the integration of selfing into msprime easier in the future.

TPPSellinger avatar Aug 17 '23 17:08 TPPSellinger

Amazing! Thank you so much! I am currently wrapping up a manuscript that should hopefully be submitted by the end of the month, but in the meantime I will check out the code!

David-Peede avatar Aug 17 '23 18:08 David-Peede