
add pyslim metadata to VCF output from msprime

Open • bhaller opened this issue 6 years ago • 1 comment

Hi folks. For background, see this question on slim-discuss: https://groups.google.com/forum/#!topic/slim-discuss/etn7plcQRY8.

SLiM's VCF output contains a bunch of additional metadata: mutation IDs, mutation types, selection coefficients, dominance coefficients, etc. This is all documented in chapter 25 of the SLiM manual, particularly sections 25.2.3 and 25.2.4 (in the current version of the manual as I write this issue, anyway). This metadata is also available to pyslim, of course, and it would be great if it made it into VCF output from msprime following the conventions already defined by SLiM. This would let people who output VCF from Python have access to the metadata they need, for example to find mutations of a particular mutation type within the VCF output (as per the link above).
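As a rough illustration of the use case, here is a minimal sketch of filtering VCF records by mutation type once the metadata is present in the INFO column. It assumes the INFO keys `MID`, `S`, `DOM`, and `MT` for mutation ID, selection coefficient, dominance coefficient, and mutation type; those key names and the example records are assumptions for illustration and should be checked against the manual sections mentioned above. The helper names `parse_info` and `records_with_mutation_type` are hypothetical.

```python
def parse_info(info_field):
    """Parse a VCF INFO column into a dict; flag-style keys map to True."""
    result = {}
    if info_field in ("", "."):
        return result
    for entry in info_field.split(";"):
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        # Values are kept as lists of strings, since INFO values may be comma-separated.
        result[key] = value.split(",") if sep else True
    return result


def records_with_mutation_type(vcf_lines, mutation_type):
    """Yield (chrom, pos, info) for records whose MT values include mutation_type."""
    wanted = str(mutation_type)
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        info = parse_info(fields[7])  # INFO is the eighth VCF column
        values = info.get("MT")  # "MT" is an assumed key name
        if isinstance(values, list) and wanted in values:
            yield fields[0], int(fields[1]), info


# Hand-written records for illustration only, not real SLiM output.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t101\t.\tA\tT\t1000\tPASS\tMID=7;S=0.0;DOM=0.5;MT=1",
    "1\t250\t.\tA\tT\t1000\tPASS\tMID=9;S=-0.01;DOM=0.5;MT=2",
]
hits = list(records_with_mutation_type(example, 2))
```

With the hand-written records above, `hits` holds only the record at position 250. The same filter would apply unchanged to VCF written from Python if the INFO fields followed SLiM's conventions.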

bhaller, Dec 02 '19 14:12

This could be done, I would imagine, but we'd need some way of outputting VCF INFO data. I've opened an issue on tskit to track this: https://github.com/tskit-dev/tskit/issues/434
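Until something like that exists, one possible stopgap is to post-process the VCF text and fill in the INFO column by hand. The sketch below is only that: `add_info` is a hypothetical helper, the INFO keys are assumed, and records are matched by POS alone, which assumes a single chromosome, unique positions, and positions in the VCF that match the ones used as dictionary keys.

```python
def add_info(vcf_lines, info_by_pos, header_lines=()):
    """Return VCF lines with extra INFO entries merged in, keyed by POS.

    info_by_pos maps an integer position to a dict of INFO key/value pairs.
    header_lines are extra ##INFO declarations placed before the #CHROM line.
    """
    out = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("#CHROM"):
            out.extend(header_lines)  # declare the new INFO keys
            out.append(line)
        elif line.startswith("#") or not line:
            out.append(line)  # other header lines and blanks pass through
        else:
            fields = line.split("\t")
            extra = info_by_pos.get(int(fields[1]))
            if extra:
                new = ";".join(f"{key}={value}" for key, value in extra.items())
                # Replace an empty INFO column, otherwise append to it.
                fields[7] = new if fields[7] in ("", ".") else fields[7] + ";" + new
            out.append("\t".join(fields))
    return out


# Hand-written input for illustration; "MID" and "MT" are assumed key names.
lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t101\t.\tA\tT\t.\tPASS\t.",
    "1\t250\t.\tA\tT\t.\tPASS\tAA=A",
]
info_by_pos = {101: {"MID": 7, "MT": 1}, 250: {"MT": 2}}
header = ['##INFO=<ID=MT,Number=.,Type=Integer,Description="Mutation type (assumed key)">']
annotated = add_info(lines, info_by_pos, header)
```

In practice `info_by_pos` would be built from the mutation metadata of the tree sequence; how to do that is left out here, since it depends on the pyslim and tskit versions in use.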

jeromekelleher, Dec 03 '19 08:12