pixy
Using pixy without an all sites VCF?
I have a fairly mundane need for pi and FST estimates that are in the ballpark but not necessarily the most accurate possible. We have a huge number of samples, and I don't have the time or resources to generate individual gVCFs for them. Can I use pixy on a standard (variants-only) VCF without calling all sites? Is there anything I should know when doing this?
If not, are there any tools you all might recommend as an alternative?
Thanks!
Hi Miles,
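There isn't a quick way around the missing data issue for pi/dxy, I'm afraid. All tools, including pixy, will give you biased estimates in the absence of an all-sites VCF, because a variants-only VCF can't distinguish sites that are truly invariant from sites that simply have no genotype calls, so the denominators end up wrong. Note that FST doesn't have the same issue, and any tool will work for that.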
The only alternative to the true 'all-sites' workflow that I am aware of is to use mop (https://github.com/RILAB/mop) on your BAM files, and use those results to ballpark the denominators for the estimates.
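To make the denominator point concrete, here is a minimal Python sketch. It is not pixy's implementation and it does not read mop output. The allele counts, the window length, and the `callable_sites` value are all made-up numbers for illustration; in practice the callable-site count would be something you estimate from your BAM coverage.

```python
from typing import Iterable, Tuple


def site_pi(ref_count: int, alt_count: int) -> float:
    """Per-site diversity at a biallelic site: pairwise differences / comparisons."""
    n = ref_count + alt_count
    if n < 2:
        return 0.0
    comparisons = n * (n - 1) / 2
    differences = ref_count * alt_count
    return differences / comparisons


def window_pi(variant_counts: Iterable[Tuple[int, int]], n_sites: int) -> float:
    """Sum per-site diversity over variant sites and divide by the number of sites.

    n_sites is the denominator: invariant sites contribute zero to the sum,
    but they still have to be counted here.
    """
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return sum(site_pi(ref, alt) for ref, alt in variant_counts) / n_sites


# Hypothetical (ref, alt) allele counts at three variant sites in one window.
variants = [(6, 2), (4, 4), (7, 1)]

window_length = 1000   # assumes every base in the window was genotyped
callable_sites = 600   # assumed estimate of sites with usable coverage

naive_pi = window_pi(variants, window_length)
adjusted_pi = window_pi(variants, callable_sites)

print(f"naive pi:    {naive_pi:.6f}")
print(f"adjusted pi: {adjusted_pi:.6f}")
```

The naive estimate divides by the full window length and so treats every uncovered base as an invariant site, which pulls pi downward. Swapping in a callable-site count only ballparks the correction, since it ignores sample-by-sample differences in missingness.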
Sorry that I can't be of more help.
Kieran