pgsc_calc
Improvements to PCA steps & default reference panel
Ideas for making PCA projections more robust
- [ ] Subsetting to a smaller set of PCA eligible variants?
- HapMap3 (same as bigsnpr)
- Ancestry informative markers (similar to Hao et al.), such as those in doi:10.3389/fgene.2012.00322
- [x] Avoid variants with high missingness in target dataset (use .vmiss files)
- Implemented in beta release
- [ ] Add OCE samples back to the reference panel, but exclude them from the empirical calculation of Z-scores because there are too few individuals for comparison.
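
A rough sketch of two of the ideas above, not the pgsc_calc implementation. It assumes the `.vmiss` files are tab-delimited PLINK 2 `--missing` output with `ID` and `F_MISS` columns. The function names, the `max_f_miss=0.1` threshold, and the per-population dictionary layout are all hypothetical and chosen only for illustration.

```python
import csv
import io
from statistics import mean, stdev


def low_missingness_ids(vmiss_lines, max_f_miss=0.1):
    """Return the IDs of variants whose F_MISS is at or below max_f_miss.

    vmiss_lines: any iterable of lines (e.g. an open .vmiss file).
    max_f_miss: hypothetical cutoff, not a pgsc_calc default.
    """
    reader = csv.DictReader(vmiss_lines, delimiter="\t")
    return {row["ID"] for row in reader if float(row["F_MISS"]) <= max_f_miss}


def reference_zscores(score, ref_scores_by_pop, excluded_pops=frozenset({"OCE"})):
    """Z-score of one value against each reference population.

    Populations in excluded_pops stay in the panel (ref_scores_by_pop is not
    modified) but are skipped here, e.g. because they have too few samples
    for a stable mean and standard deviation.
    """
    zscores = {}
    for pop, values in ref_scores_by_pop.items():
        if pop in excluded_pops:
            continue
        zscores[pop] = (score - mean(values)) / stdev(values)
    return zscores


if __name__ == "__main__":
    # Toy .vmiss content; real files come from `plink2 --missing`.
    vmiss_text = (
        "#CHROM\tID\tMISSING_CT\tOBS_CT\tF_MISS\n"
        "1\trs1\t0\t100\t0\n"
        "1\trs2\t20\t100\t0.2\n"
        "1\trs3\t5\t100\t0.05\n"
    )
    print(sorted(low_missingness_ids(io.StringIO(vmiss_text))))

    # Toy reference scores; OCE is present in the panel but gets no Z-score.
    ref = {"EUR": [1.0, 2.0, 3.0], "OCE": [5.0, 6.0]}
    print(reference_zscores(2.0, ref))
```

The set returned by `low_missingness_ids` could then be intersected with the PCA-eligible variant list before running the projection.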
Previously implemented:
- Merged 1000 Genomes & Human Genome Diversity Project (HGDP):
- ref: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9900804/
- data: https://gnomad.broadinstitute.org/news/2020-10-gnomad-v3-1-new-content-methods-annotations-and-data-availability/#the-gnomad-hgdp-and-1000-genomes-callset