plink-ng

read haploid dosages with pgenlib

Open 23andme-jaredo opened this issue 2 years ago • 5 comments

Is it possible to read haploid dosages with pgenlib.PgenReader?

thanks,

Jared

23andme-jaredo, Jan 18 '23 17:01

As with the plink .bed format, haploid vs. diploid is not directly encoded in the .pgen. Instead, plink and plink2 divide the encoded values by two when the .bim/.pvar (and on chrX, .fam/.psam) file indicates that we're dealing with haploid data.
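To make the halving concrete, here is a minimal Python sketch. It is not plink2's actual code. It assumes the dosages have already been decoded to the diploid 0..2 scale, uses `None` for missing values, and takes a hypothetical `is_haploid` flag per sample that you would derive yourself from the .pvar chromosome and, on chrX, the .psam sex column.

```python
def rescale_for_ploidy(dosages, is_haploid):
    """Halve diploid-scale dosages for samples that are haploid at this variant.

    dosages    -- floats on the diploid 0..2 scale, or None for missing
    is_haploid -- one bool per sample (hypothetical; derived from .pvar/.psam)
    """
    if len(dosages) != len(is_haploid):
        raise ValueError("dosages and is_haploid must have the same length")
    rescaled = []
    for dosage, haploid in zip(dosages, is_haploid):
        if dosage is None:
            # Missing stays missing regardless of ploidy.
            rescaled.append(None)
        elif haploid:
            # The stored value is on the diploid scale, so divide by two.
            rescaled.append(dosage / 2.0)
        else:
            rescaled.append(dosage)
    return rescaled


if __name__ == "__main__":
    print(rescale_for_ploidy([2.0, 1.0, None, 0.5], [True, False, True, True]))
```

The point is only that ploidy lives in the sample and variant metadata, so the same stored value is interpreted differently depending on the .pvar/.psam.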

chrchang, Jan 18 '23 18:01

hmmm so I am a bit confused. I have imputed data converted from bcf via:

plink2 --bcf $bcf dosage=HDS --make-pfile

and I can see that the two haploid dosages per individual are stored because I can recover them via:

 plink2 --pfile plink2 --export vcf bgz vcf-dosage=HDS

so I am trying to extract those HDS values via pgenlib.

23andme-jaredo, Jan 18 '23 18:01

Maybe I wasn't clear that I meant imputed haploid/phased probabilities, not hard genotypes.

23andme-jaredo, Jan 18 '23 18:01

Oh, sorry, I thought you were referring to e.g. chrX/chrY/chrM.

The PgrGetDp() function in pgenlib_read.h is the simplest one that can return biallelic phased dosages.
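For the arithmetic on the caller's side, a rough Python sketch follows. It assumes the phased dosage for a sample is available as a total dosage (h1 + h2, on the 0..2 scale) plus a signed difference between the two haplotypes (h1 - h2). That representation and the function name are assumptions for illustration only; check pgenlib_read.h for the actual types and scaling that PgrGetDp() uses.

```python
def split_phased_dosage(total, diff):
    """Recover per-haplotype dosages (the HDS pair) for one sample.

    total -- h1 + h2, on the 0..2 scale (assumed representation)
    diff  -- h1 - h2, signed (assumed representation)
    """
    h1 = (total + diff) / 2.0
    h2 = (total - diff) / 2.0
    # Each haplotype dosage is a probability-like value in [0, 1].
    if not (0.0 <= h1 <= 1.0 and 0.0 <= h2 <= 1.0):
        raise ValueError("total/diff pair is not a valid phased dosage")
    return h1, h2


if __name__ == "__main__":
    print(split_phased_dosage(1.0, 0.5))
```

With a difference of zero this falls back to splitting the total evenly, which is all that can be said about an unphased dosage.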

chrchang, Jan 18 '23 19:01

Thanks! We'll try exposing that in Python.

23andme-jaredo, Jan 18 '23 19:01