2vcf
Ancestry chr25 (XY) and chr26 (MT) are not converted
Hello... I used the converter on an AncestryDNA raw data file. Conversion stops after chromosome 23 (X) and 24 (Y). Chromosomes 25 (XY) and 26 (MT) are not converted; they do not appear in the vcf file. 23andme2vcf.pl had the chrM markers, 2vcf does not. Just a note, but would appreciate your comments. Thanks.
Thanks for your comment @s-usr. Looking at my own results, I'm seeing the same thing. I'm checking into it. Do you know which version of the Ancestry results you have?
this is probably related to issue #13
Hi...
AncestryDNA array version: V2.0
Formatted using AncestryDNA converter version: V1.0
Human reference build 37.1 coordinates
rsid chromosome position allele1 allele2
I just wanted to elaborate on the structure of the AncestryDNA raw data file. It is likely that you already have this info. I took the basic test.
- Chromosomes 1-22: autosomal
- Chromosome 23: X (25250 markers in my raw data file)
- 24: Y (1668 markers in my raw data file)
- 25: XY (36 markers in my raw data file)
- 26: MT (263 markers in my raw data file)
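For anyone who wants to check counts like these on their own file, here is a minimal Python sketch. It assumes the raw data is tab-delimited, that comment lines start with `#`, and that the column header line starts with `rsid`. The function name `count_markers_by_chromosome` and the sample rows are made up for illustration and are not part of 2vcf.

```python
from collections import Counter
import io


def count_markers_by_chromosome(lines):
    """Count data rows per chromosome code in an AncestryDNA-style raw file."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blanks, comment lines, and the column header (assumed layout).
        if not line or line.startswith("#") or line.startswith("rsid\t"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # malformed row; ignore it rather than guess
        counts[fields[1]] += 1
    return counts


# Tiny made-up sample, not real genotype data.
sample = io.StringIO(
    "#illustrative comment line\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t1\t1000\tA\tA\n"
    "rs0000002\t23\t2000\tC\tT\n"
    "rs0000003\t25\t3000\tG\tG\n"
    "rs0000004\t26\t4000\tT\tT\n"
    "rs0000005\t26\t5000\tA\tA\n"
)
counts = count_markers_by_chromosome(sample)
for code in sorted(counts, key=int):
    print(code, counts[code])
```

To run it on a real file, pass an open file handle instead of the `io.StringIO` sample. Comparing these counts against the records per contig in the output VCF would show which chromosome codes were dropped.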
Thanks @s-usr. I'm trying to understand what's going on with the non-autosomal markers. There's a serious disconnect between the coordinates that 23andMe and Ancestry use for some of these markers and the ones recorded in dbSNP for the same rsIDs. For example, 23andMe reports this result: rs2215794 Y 9900057, but dbSNP has no record of rs2215794 ever being on chrY, only chr1.
I'm looking into this, but any additional help is appreciated. Whatever the outcome, I'm happy to make changes to the reference or drop any confusing data points. I just want to understand what's going on here so I can pick a course of action that benefits 2vcf users.
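One way to get a sense of how widespread the disconnect is would be to flag every marker whose reported chromosome is not among those listed for its rsID in the reference. The Python sketch below is only an illustration of that check, not how 2vcf works. The function name `find_chromosome_mismatches` and the in-memory dictionary standing in for a dbSNP lookup are assumptions, and the only data used is the rs2215794 example from this comment.

```python
def find_chromosome_mismatches(raw_records, reference_chroms):
    """Return records whose reported chromosome is not listed for that rsID.

    raw_records: iterable of (rsid, chromosome, position) tuples.
    reference_chroms: dict mapping rsid -> set of chromosomes in the reference.
    """
    mismatches = []
    for rsid, chrom, pos in raw_records:
        known = reference_chroms.get(rsid)
        if known is None:
            continue  # rsID absent from the reference; a separate question
        if chrom not in known:
            mismatches.append((rsid, chrom, pos, sorted(known)))
    return mismatches


# The one example from this thread: the raw file says Y, dbSNP lists chr1 only.
raw = [("rs2215794", "Y", 9900057)]
reference = {"rs2215794": {"1"}}  # hypothetical stand-in for a dbSNP lookup
mismatches = find_chromosome_mismatches(raw, reference)
for rsid, chrom, pos, known in mismatches:
    print(f"{rsid}: raw file says chr{chrom}:{pos}, reference lists {known}")
```

Grouping the flagged records by chromosome would show whether the problem is limited to Y, XY, and MT or also touches autosomal markers.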
Thank you Sir. Also, I noticed in a sample 23andMe raw data file that 23andMe does not report XY (the X/Y pseudoautosomal region) at all. 23andMe reports:
- Chromosomes 1-22: autosomal
- Chromosome 23: X
- 24: Y
- 25: MT (mitochondrial).
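Going by the two layouts described in this thread, code 25 means XY for AncestryDNA but MT for 23andMe, so a converter has to know which vendor a file came from before it renames chromosomes. Here is a hedged Python sketch of that lookup. The tables are taken from the comments above rather than from vendor documentation, and `normalize_chromosome` is a hypothetical helper, not a 2vcf function.

```python
# Chromosome codes as described in this thread (assumptions, not vendor docs).
ANCESTRY_CODES = {"23": "X", "24": "Y", "25": "XY", "26": "MT"}
TWENTYTHREEANDME_CODES = {"23": "X", "24": "Y", "25": "MT"}

VENDOR_TABLES = {
    "ancestry": ANCESTRY_CODES,
    "23andme": TWENTYTHREEANDME_CODES,
}


def normalize_chromosome(code, vendor):
    """Map a vendor chromosome code to a label such as '7', 'X', 'XY' or 'MT'."""
    if vendor not in VENDOR_TABLES:
        raise ValueError(f"unknown vendor: {vendor!r}")
    code = str(code).strip().upper()
    # Autosomes and labels that are already letters pass through unchanged.
    return VENDOR_TABLES[vendor].get(code, code)


print(normalize_chromosome("25", "ancestry"))
print(normalize_chromosome("25", "23andme"))
print(normalize_chromosome(7, "ancestry"))
```

Keeping the tables per vendor makes it harder for a shared numeric-to-label map to silently turn Ancestry's XY markers into mitochondrial ones, or the reverse.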
Is it possible that GRCh38 resolves some of the discrepancies?
ftp://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
Not sure why the link did not work. This one should.
ftp://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz