sgn
Genetic map positions matching genotyping data download
Feature Idea:
Imputation, prediction of cross variance and mate selection use genetic map coordinates to compute recombination probability matrices.
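To illustrate why cM positions matter here, below is a minimal sketch of turning genetic map positions into a pairwise recombination frequency matrix. It assumes the Haldane mapping function, which is one common choice and not necessarily what any particular downstream tool uses. The function names and the example positions are made up for illustration.

```python
import math

def haldane_recomb_freq(d_cm):
    """Recombination frequency for a map distance in cM (Haldane mapping function)."""
    d_morgans = abs(d_cm) / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def recomb_matrix(cm_positions):
    """Pairwise recombination frequencies for markers on a single chromosome."""
    return [[haldane_recomb_freq(a - b) for b in cm_positions]
            for a in cm_positions]

# Hypothetical cM positions for three markers on one chromosome.
positions_cm = [0.0, 5.0, 50.0]
R = recomb_matrix(positions_cm)
```

Markers on different chromosomes would be handled separately (recombination frequency 0.5), which this sketch does not cover.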
Ideally, users could download a genotype dataset (VCF or dosage) and obtain the centimorgan (cM) positions for the downloaded markers directly from the database. This option could live in the Wizard --> "Download related genotyping data" section or on the Manage/Downloads page.
Key issues:
- The physical coordinates referenced by a genetic map must be on the same reference genome version as the genotyping data.
- Multiple genetic maps may exist as sources of coordinates. Should users be able to choose among them? Should they also be able to upload their own map coordinates?
- Genetic map interpolation: Most genetic maps contain only a subset of the positions in a genotyping dataset. A simple solution, which I've implemented outside the database, is to interpolate genetic map coordinates using splines. Providing this interpolation under the hood would be very helpful to users.
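
The interpolation idea in the last bullet can be sketched roughly as follows. Note that this sketch uses piecewise-linear interpolation rather than splines, only to keep the example dependency-free; it is not the implementation mentioned above. The function name, the clamping of positions outside the map range, and the example values are all assumptions for illustration.

```python
from bisect import bisect_right

def interpolate_cm(map_bp, map_cm, query_bp):
    """Estimate cM positions for query bp positions on one chromosome.

    map_bp must be strictly increasing and map_cm non-decreasing.
    Queries outside the map range are clamped to the first/last cM value
    (an assumption; extrapolation would be another option).
    """
    out = []
    for q in query_bp:
        if q <= map_bp[0]:
            out.append(map_cm[0])
        elif q >= map_bp[-1]:
            out.append(map_cm[-1])
        else:
            i = bisect_right(map_bp, q)  # map_bp[i-1] <= q < map_bp[i]
            x0, x1 = map_bp[i - 1], map_bp[i]
            y0, y1 = map_cm[i - 1], map_cm[i]
            out.append(y0 + (y1 - y0) * (q - x0) / (x1 - x0))
    return out

# Hypothetical genetic map (bp, cM) and marker positions from a genotype file.
map_bp = [1000, 2000, 4000]
map_cm = [0.0, 1.0, 2.0]
marker_cm = interpolate_cm(map_bp, map_cm, [500, 1500, 3000, 5000])
# marker_cm == [0.0, 0.5, 1.5, 2.0]
```

Whatever method is used, it should keep cM positions non-decreasing along each chromosome; an unconstrained spline does not guarantee this, so a monotone variant may be preferable.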
Here's a version of the ICGMC Cassava consensus genetic map, which I believe Guillaume Bauchet originally created and which I've recently used to interpolate positions for my marker data: https://cassavabase.org/ftp/marnin_datasets/NGC_BigData/CassavaGeneticMap/cassava_cM_pred.v6.allchr.txt