
Ancestry chr25 (XY) and chr26 (MT) are not converted

Open s-usr opened this issue 4 years ago • 8 comments

Hello... I used the converter on an AncestryDNA raw data file. Conversion stops after chromosome 23 (X) and 24 (Y). Chromosomes 25 (XY) and 26 (MT) are not converted; they do not appear in the vcf file. 23andme2vcf.pl had the chrM markers, 2vcf does not. Just a note, but would appreciate your comments. Thanks.

s-usr avatar Aug 31 '19 15:08 s-usr

Thanks for your comment @s-usr. Looking at my own results, I'm seeing the same thing, and I'm checking into it. Do you know which version of the Ancestry results you have?

plantimals avatar Aug 31 '19 15:08 plantimals

this is probably related to issue #13

plantimals avatar Aug 31 '19 15:08 plantimals

Hi...

AncestryDNA array version: V2.0
Formatted using AncestryDNA converter version: V1.0
Human reference build 37.1 coordinates
rsid chromosome position allele1 allele2
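For anyone reproducing this, here is a minimal sketch of reading a file with the header quoted above. It is not 2vcf's actual parser. It assumes metadata lines start with `#`, that the columns are whitespace-separated, and that a single column-header row beginning with `rsid` precedes the data. The function name `read_ancestry_raw` and the sample marker are hypothetical.

```python
import io


def read_ancestry_raw(handle):
    """Yield (rsid, chromosome, position, allele1, allele2) tuples.

    Assumptions (not confirmed by the thread): comment lines start
    with '#', fields are whitespace-separated, and one header row
    starting with 'rsid' comes before the markers.
    """
    header_seen = False
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and metadata comments
        fields = line.split()
        if not header_seen:
            header_seen = True
            if fields[0].lower() == "rsid":
                continue  # column header row, not a marker
        rsid, chrom, pos, allele1, allele2 = fields[:5]
        yield rsid, chrom, int(pos), allele1, allele2


# Hypothetical sample; the rsid and position are made up.
sample = io.StringIO(
    "#AncestryDNA array version: V2.0\n"
    "rsid\tchromosome\tposition\tallele1\tallele2\n"
    "rs0000001\t25\t100\tA\tG\n"
)
for marker in read_ancestry_raw(sample):
    print(marker)
```

The chromosome is kept as a string on purpose, so codes such as 25 and 26 can be translated later rather than silently dropped.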

s-usr avatar Aug 31 '19 16:08 s-usr

I just wanted to elaborate on the structure of the AncestryDNA raw data file. It is likely that you already have this info. I took the basic test.

  • Chromosomes 1-22: autosomal
  • Chromosome 23: X - 25250 markers in my raw data file
  • 24: Y - 1668 markers in my raw data file
  • 25: XY - 36 markers in my raw data file
  • 26: MT - 263 markers in my raw data file
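The numbering above can be sketched as a small lookup, which also makes it easy to tally markers per chromosome and see whether 25 and 26 survive conversion. This is an illustration only: the names `ANCESTRY_CODES`, `ancestry_chrom_label` and `count_by_chromosome` are hypothetical, and how a converter should write XY markers into a VCF (for example as X) is left open here.

```python
from collections import Counter

# Numeric codes as described in this comment for AncestryDNA files.
ANCESTRY_CODES = {"23": "X", "24": "Y", "25": "XY", "26": "MT"}


def ancestry_chrom_label(code):
    """Translate an AncestryDNA chromosome code into a readable label."""
    code = str(code).strip()
    if code in ANCESTRY_CODES:
        return ANCESTRY_CODES[code]
    if code.isdigit() and 1 <= int(code) <= 22:
        return code  # autosomes keep their number
    raise ValueError(f"unrecognised chromosome code: {code!r}")


def count_by_chromosome(codes):
    """Count markers per chromosome label from an iterable of codes."""
    return Counter(ancestry_chrom_label(c) for c in codes)


# Hypothetical input: chromosome codes taken from a raw data file.
print(count_by_chromosome(["1", "23", "25", "25", "26"]))
```

Raising on an unknown code, rather than skipping it, is a deliberate choice in this sketch so that dropped chromosomes show up as errors instead of missing VCF records.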

s-usr avatar Aug 31 '19 20:08 s-usr

Thanks @s-usr. I'm trying to understand what's going on with the non-autosomal markers. There's a serious disconnect between the coordinates that 23andMe and Ancestry use for some of these markers and the ones recorded in dbSNP for the same RSIDs. For example, 23andMe reports rs2215794 on Y at position 9900057, but dbSNP has no record of rs2215794 ever being on chrY, only chr1.

I'm looking into this, but any additional help is appreciated. Ultimately, whatever the outcome is, I'm happy to make changes to the reference or drop any confusing datapoints, I just want to understand for myself what's going on here so I can pick a course of action that benefits 2vcf users.
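One way to size the problem is to split the raw markers by whether the vendor's chromosome agrees with the reference. The sketch below assumes a lookup table from rsID to the set of chromosomes the reference records for it. That table, and the name `split_by_reference`, are hypothetical and not how 2vcf stores its reference. The only data point used is the rs2215794 example from this comment.

```python
def split_by_reference(markers, known_chroms):
    """Partition markers by agreement with a reference lookup.

    markers:      iterable of (rsid, chrom, pos) from the raw data file
    known_chroms: dict mapping rsid -> set of chromosome labels that the
                  reference records for that rsid (hypothetical structure)
    Returns (concordant, discordant, unknown) lists of markers.
    """
    concordant, discordant, unknown = [], [], []
    for rsid, chrom, pos in markers:
        expected = known_chroms.get(rsid)
        if expected is None:
            unknown.append((rsid, chrom, pos))  # rsid absent from reference
        elif chrom in expected:
            concordant.append((rsid, chrom, pos))
        else:
            discordant.append((rsid, chrom, pos))  # chromosome disagrees
    return concordant, discordant, unknown


# The example from this thread: the vendor says Y, the reference says chr1.
markers = [("rs2215794", "Y", 9900057)]
known_chroms = {"rs2215794": {"1"}}
concordant, discordant, unknown = split_by_reference(markers, known_chroms)
print(discordant)
```

Keeping the discordant and unknown markers in separate lists, instead of discarding them, leaves the decision about whether to drop them or patch the reference for later.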

plantimals avatar Sep 01 '19 19:09 plantimals

Thank you, Sir. Also, in a sample 23andMe raw data file I noticed that 23andMe does not report XY (the X/Y pseudoautosomal region) at all. 23andMe reports:

  • Chromosomes 1-22: autosomal
  • Chromosome 23: X
  • 24: Y
  • 25: MT (Mitochondrial).

s-usr avatar Sep 02 '19 19:09 s-usr

Is it possible that GRCh38 resolves some of the discrepancies?

ftp://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

s-usr avatar Sep 02 '19 19:09 s-usr

Not sure why the link did not work. This one should.

ftp://ftp.ensembl.org/pub/release-97/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

s-usr avatar Sep 02 '19 19:09 s-usr