
Genetic map positions matching genotyping data download

Open wolfemd opened this issue 3 years ago • 0 comments

Feature Idea:

Imputation, prediction of cross variance and mate selection use genetic map coordinates to compute recombination probability matrices.
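
To illustrate why cM positions are needed, here is a minimal sketch of turning genetic map positions into a pairwise recombination frequency matrix for markers on one chromosome. It assumes the Haldane map function, which is one common choice and not necessarily what any particular imputation or prediction tool uses; the function names are hypothetical.

```python
import math


def haldane_recomb_freq(d_cm):
    """Recombination frequency for a map distance in cM (Haldane map function).

    Assumption: Haldane's function r = 0.5 * (1 - exp(-2d)), with d in Morgans.
    """
    d_morgans = abs(d_cm) / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def recomb_matrix(cm_positions):
    """Pairwise recombination frequencies for markers on a single chromosome."""
    return [
        [haldane_recomb_freq(a - b) for b in cm_positions]
        for a in cm_positions
    ]


# Illustrative positions only, not taken from any real map.
example_matrix = recomb_matrix([0.0, 10.0, 50.0])
```

Each entry approaches 0.5 as the distance between two markers grows, and markers on different chromosomes would simply be assigned 0.5.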

Ideally, users could download a genotype dataset (VCF or dosage) and obtain the centimorgan (cM) positions for the downloaded markers directly from the database. This could be offered in the Wizard --> "Download related genotyping data" or on the Manage/Downloads page.

Key issues:

  1. The physical coordinates referred to in a genetic map must be on the same reference genome version as the genotyping data.
  2. Multiple genetic maps exist as potential sources of coordinates: should users be able to choose among them? Should users be able to upload their own map coordinates?
  3. Genetic map interpolation: Most genetic maps contain only a subset of the positions in a genotyping dataset. A simple solution, which I've implemented outside the database, is to interpolate genetic map coordinates using splines. Providing this interpolation under the hood would be a real help to users.
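
As a rough sketch of the interpolation idea in point 3, the following estimates cM positions for markers that are not in the genetic map, one chromosome at a time. Note that it uses plain piecewise-linear interpolation as a simpler stand-in for the spline approach mentioned above, and it clamps markers outside the mapped range to the end values; both choices, the function name, and the example numbers are assumptions for illustration.

```python
from bisect import bisect_right


def interpolate_cm(map_bp, map_cm, query_bp):
    """Estimate cM positions at query physical positions on one chromosome.

    map_bp: physical positions (bp) of map markers, sorted ascending.
    map_cm: cM positions of the same markers, non-decreasing.
    query_bp: physical positions of the genotyped markers.
    Assumption: all positions are on the same reference genome version.
    """
    if len(map_bp) != len(map_cm) or len(map_bp) < 2:
        raise ValueError("need at least two map markers with matching cM values")
    out = []
    for q in query_bp:
        if q <= map_bp[0]:
            out.append(map_cm[0])    # clamp below the first map marker
        elif q >= map_bp[-1]:
            out.append(map_cm[-1])   # clamp above the last map marker
        else:
            i = bisect_right(map_bp, q)  # map_bp[i-1] <= q < map_bp[i]
            x0, x1 = map_bp[i - 1], map_bp[i]
            y0, y1 = map_cm[i - 1], map_cm[i]
            out.append(y0 + (y1 - y0) * (q - x0) / (x1 - x0))
    return out


# Made-up example values, not from the cassava map.
example_cm = interpolate_cm(
    [1000, 5000, 9000], [0.0, 2.0, 3.0], [500, 3000, 5000, 7000, 10000]
)
# example_cm == [0.0, 1.0, 2.0, 2.5, 3.0]
```

A monotone spline could replace the linear step if smoother estimates are wanted, provided the result stays non-decreasing along the chromosome.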

Here's a version of the ICGMC Cassava consensus genetic map, which I believe Guillaume Bauchet originally created, and which I've used to interpolate positions for my marker data in the recent past: https://cassavabase.org/ftp/marnin_datasets/NGC_BigData/CassavaGeneticMap/cassava_cM_pred.v6.allchr.txt

wolfemd, Jan 14 '22 19:01