tskit
`mutation_mask` argument to site-mode statistics
Over in stdpopsim we want to compute frequency spectra for only a certain set of mutations (the non-neutral ones, for instance). These are mixed right in with the neutral ones (consider synonymous/nonsynonymous mutations). To make this easier we could provide a `mutation_mask` argument that applies only to statistics with `mode="site"`: a boolean vector of length equal to the number of mutations, such that only the mutations it selects would be used. This would not affect the denominator (if `span_normalise=True`), so each window would still be normalised by its full length.
(Initially I thought this would be `site_mask`, but for cases with more than one mutation at a site, `mutation_mask` is better.)
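To make the intended semantics concrete, here is a rough pure-Python sketch of what I have in mind. It is not tskit code: `masked_site_afs` and its arguments are hypothetical, it assumes `True` in the mask means "use this mutation", and it assumes we already know how many samples carry each mutation (so it ignores the complications of recurrent mutation at a site).

```python
def masked_site_afs(derived_counts, mutation_mask, num_samples,
                    sequence_length, span_normalise=True):
    """Toy polarised site frequency spectrum restricted to masked-in mutations.

    derived_counts[j] is the number of samples carrying mutation j
    (assumed to be known already; hypothetical input, not a tskit structure).
    mutation_mask[j] is True if mutation j should be counted (assumption).
    """
    if len(mutation_mask) != len(derived_counts):
        raise ValueError("mutation_mask must have one entry per mutation")
    afs = [0.0] * (num_samples + 1)
    for count, keep in zip(derived_counts, mutation_mask):
        if keep:
            afs[count] += 1
    if span_normalise:
        # The denominator is the full span, regardless of the mask.
        afs = [x / sequence_length for x in afs]
    return afs


# Five mutations on a sequence of length 10 with 4 samples;
# only the first and third are flagged as non-neutral.
derived_counts = [1, 1, 2, 3, 1]
is_non_neutral = [True, False, True, False, False]
afs = masked_site_afs(derived_counts, is_non_neutral,
                      num_samples=4, sequence_length=10.0)
```

Here the two non-neutral mutations each contribute 1/10 to their frequency class, while the three neutral ones are skipped without shrinking the span used for normalisation.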
SGTM - would need some considerable replumbing though I fear, as there's no allowance for this sort of thing in the C API.