msms
msms copied to clipboard
Unexpected behavior with -es and selection
When using the es flag to split a population, the allele frequencies of the trajectories are not averaged as I would expect. I.e for the command at the end, A is fixed in population 2 and absent from population 1. Then A receives 1% of immigrants (-es), and the allele frequency becomes ~0.5, when most people would probably expect a weighted mean, i.e an allele frequency of around 0.01
msms 1 1 -r 0 -t 2 -I 1 1 0
-es 0.01 1 0.1
-ej 1. 2 1
-SI 0.011 2 0 1
-N 1000 -oTrace > bug
Edit: OK, I found that weighting is done using population sizes, not split proportions. So, using a split flag of
-es 0.01 1 0.1 -en 0.01 2 0.1
solves the issue, but the behavior is still somewhat surprising, and perhaps could be mentioned in the manual.
Not sure i follow. In pastward time a split (-es) with a 0.1 chance of lineages moving to the new deme means in forward time 10% of the current population is made up from that deme. This is sampled as well. A join in pastward time (-ej) is a split in forward time and the new population samples from the parent population. In pastward time all lineages are moved out of the deme joined.
On Tue, Jun 9, 2015 at 6:25 PM, BenjaminPeter [email protected] wrote:
When using the es flag to split a population, the allele frequencies of the trajectories are not averaged as I would expect. I.e for the command at the end, A is fixed in population 2 and absent from population 1. Then A receives 1% of immigrants (-es), and the allele frequency becomes ~0.5, when most people would probably expect a weighted mean, i.e an allele frequency of around 0.01
msms 1 1 -r 0 -t 2 -I 1 1 0 -es 0.01 1 0.1 -ej 1. 2 1 -SI 0.011 2 0 1 -SA 1000 -N 1000 -Sc 0 1 0 -oTrace -SFC > bug
— Reply to this email directly or view it on GitHub https://github.com/delt0r/msms/issues/41.
I have no special talents. I am only passionately curious. --Albert Einstein
OK, I try to spell out my expectation for the split (-es), to hopefully make things clear:
Backward in time:
Population 1 splits into two population X and Y.
Lineages in pop 1 at the split should pick deme X with probability 0.1, deme Y with probability 0.9. So if the current frequency at the split in population 1 is p_1, the initial frequencies in populations X and Y should also be p_1, since we expect lineages to pick their ancestral deme independent of their genotype.
Forward in time
Populations X and Y merge into population 1, and population 1 is made up of 90% individuals from Y and 10% individuals from X. Now, at the split time, the frequency of the selected allele in X is p_x=1 and in Y it is p_y=0. So if we pick a random individual right after the split, I would expect it to be with probability p_x * 0.1 + p_y * 0.9 a carrier of the selected allele, so the allele frequency should be 10%.
Now, obviously these models for the frequency differ from each other, however as trajectories are simulated forward in time, I was expecting the mixing I described. Currently, the mixing proportion is p_x * N_x + p_y * N_y, where N are the sizes of X and Y, which was not my first guess. Changing the population sizes immediately before the -es event to the mixture proportions is a workaround, but it might be nice to document that behavior somewhere in the docs.