eems-around-the-world
Repo to analyze population genetic data with many different methods
EEMS-AROUND-THE-WORLD
Goal
This pipeline was built for the Peter et al. 2019 manuscript, which applies EEMS to a number of human populations and compares the results to PCA on the same datasets. The pipeline shared here also includes workflows for comparing several additional methods (listed below).
Reproducing results from Peter et al. 2019
As some of the data used requires permission, we are not free to redistribute it. To regenerate all figures from the paper, it will be necessary to:
- acquire access to all data and create the master data set as described in the merge-pipeline
- change the paths in config/config.json to reflect your working environment
- run snakemake all
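Before starting a long Snakemake run, it can help to check that the paths in the config file point somewhere real. The following is a minimal sketch, not part of the pipeline: the helper name `find_missing_paths` is hypothetical, and because the layout of config/config.json is not described here, the code makes no assumption about its keys. It walks every value and treats any string containing a `/` as a candidate path.

```python
import json
import os


def find_missing_paths(config, prefix=""):
    """Recursively collect (key, value) pairs whose value looks like a
    path (a string containing '/') but does not exist on disk."""
    missing = []
    if isinstance(config, dict):
        items = config.items()
    elif isinstance(config, list):
        items = enumerate(config)
    else:
        items = []
    for key, value in items:
        # Build a readable label such as "paths/data" or "inputs/0".
        label = f"{prefix}/{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            missing.extend(find_missing_paths(value, label))
        elif isinstance(value, str) and "/" in value and not os.path.exists(value):
            missing.append((label, value))
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    config_file = "config/config.json"
    if os.path.exists(config_file):
        with open(config_file) as handle:
            for label, value in find_missing_paths(json.load(handle)):
                print(f"{label}: {value} does not exist")
```

The heuristic is deliberately crude. File prefixes (for example a PLINK fileset name without an extension) and output locations that have not been created yet will be reported as missing, so treat the output as a list of entries to review rather than as errors.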
Implementation details
Genotypic data is stored in plink format.
Metadata/location data is stored using the PopGenStructures data format, with some minor (recommended) changes.
The pipeline is implemented using Snakemake, using Python for most data wrangling and R for most plotting.
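As an illustration of the kind of Python data wrangling involved, the sketch below reads sample information from a PLINK .fam file. It assumes the genotypes are stored as a PLINK binary fileset (.bed/.bim/.fam), which this README does not state explicitly. The names `FamRecord` and `parse_fam_lines` are hypothetical and do not come from the pipeline. A .fam file has six whitespace-separated columns per sample: family ID, individual ID, father ID, mother ID, sex code, and phenotype.

```python
from collections import namedtuple

# One row of a PLINK .fam file; all fields are kept as strings.
FamRecord = namedtuple(
    "FamRecord",
    ["family_id", "individual_id", "father_id", "mother_id", "sex", "phenotype"],
)


def parse_fam_lines(lines):
    """Parse an iterable of .fam lines into a list of FamRecord tuples."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 6:
            raise ValueError(
                f"line {line_number}: expected 6 columns, found {len(fields)}"
            )
        records.append(FamRecord(*fields))
    return records
```

With a file on disk (the path here is a placeholder), the individual IDs could then be collected with `with open("data/example.fam") as handle: ids = [r.individual_id for r in parse_fam_lines(handle)]` and matched against the sample identifiers in the metadata.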