BGEN file does not work

Open jielab opened this issue 6 years ago • 5 comments

Hi, I converted the UKB chr22 BGEN file to VCF format using PLINK2. I first used PLINK to run an association analysis on both input files to confirm that I got exactly the same results.

Then I used RVTESTS to run association analyses on the BGEN file and the VCF file separately, with the options "--dosage DS --impute drop --single score". However, as the plots below show, the EAF, BETA, and P values from the two analyses are totally different. I think I reported this before; I am now using the latest version of RVTESTS.

So, can you please take a look?

best regards, Jie

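[Image: height rvtests-pgen-rvtests-vcf] https://user-images.githubusercontent.com/26947455/36263783-f5bc014a-1238-11e8-9196-a34b26eb3b4c.png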

jielab avatar Feb 15 '18 15:02 jielab

We will surely take a look. Thanks for the comparisons.

All the best, Dajiang

Assistant Professor Dept. of Public Health Sciences Institute of Personalized Medicine Penn State College of Medicine, HCAR 2020, Mail Stop R125 Email: [email protected] URL: https://dajiangliu.wordpress.com Tel: +1-717-531-4178


dajiangliu avatar Feb 16 '18 17:02 dajiangliu

Hi, in the latest UKBB v3 release the BGEN file has no “DS” field, only “GP” data. Could you please check your code that calculates the dosage from the GP values in the BGEN file? Additionally, could you add an option for VCF input that calculates the dosage from “GP”, in addition to the current “DS”? Thank you very much.

xchenscd avatar Aug 09 '18 18:08 xchenscd

@xchenscd If my memory is correct, BGEN has neither DS nor GP fields. Internally, dosage-like genotypes are used for the association tests.

Do you mean calculating the dosage from the GP field in the VCF file?
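
For reference, the GP-to-dosage conversion being asked about can be sketched as follows. This is a minimal illustration, not RVTESTS code: it assumes a biallelic, diploid site whose GP triple is ordered as (P(0/0), P(0/1), P(1/1)), and the function name `gp_to_dosage` is hypothetical.

```python
def gp_to_dosage(gp):
    """Expected alternate-allele count from a GP triple.

    Assumes gp = (P(0/0), P(0/1), P(1/1)) for a biallelic, diploid site.
    """
    p_ref_ref, p_het, p_alt_alt = gp
    total = p_ref_ref + p_het + p_alt_alt
    if total <= 0:
        raise ValueError("genotype probabilities sum to zero (missing call?)")
    # Renormalise in case rounding means the probabilities do not sum to 1.
    return (p_het + 2.0 * p_alt_alt) / total


# Example: a mostly homozygous-reference call yields a small dosage.
print(gp_to_dosage((0.90, 0.09, 0.01)))
```

The result lies between 0 and 2, which is the same scale as a DS value in a VCF.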

zhanxw avatar Dec 06 '18 06:12 zhanxw

@dajiangliu I think we have an answer for @jiehuang001. Do you remember the answer/solution?

zhanxw avatar Dec 06 '18 06:12 zhanxw

I think the difference from PLINK may be because PLINK uses hard genotype calls as input. If you give it a BGEN file, it first converts it internally to hard genotype calls, which leads to sizable differences from an analysis that uses dosages. We tried quite a few examples in which we calculated the association statistic manually in R and compared it with RVTESTS, and the results look concordant. From what we saw, I suspect the difference comes from using hard genotype calls in one analysis and dosages in the other. If you have an example that shows a difference for another reason, please let us know and we will debug it. Thank you!
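
To illustrate why hard calls and dosages can disagree, here is a small sketch. It is not the R check mentioned above, and it reproduces neither PLINK's nor RVTESTS' internals. The genotype probabilities and phenotype values are made-up toy numbers, and all function names are hypothetical. It compares the alternate-allele frequency and a simple least-squares slope under the two codings.

```python
def dosage(gp):
    # Expected alt-allele count, assuming gp = (P(0/0), P(0/1), P(1/1)).
    return gp[1] + 2.0 * gp[2]


def hard_call(gp):
    # Most likely genotype, encoded as an alt-allele count of 0, 1, or 2.
    return max(range(3), key=lambda g: gp[g])


def alt_allele_freq(genotypes):
    # Mean alt-allele count divided by ploidy (2).
    return sum(genotypes) / (2.0 * len(genotypes))


def ols_slope(x, y):
    # Slope of a simple linear regression of y on x.
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Toy data (illustrative only): one GP triple and one phenotype per sample.
gps = [
    (0.55, 0.40, 0.05),
    (0.60, 0.35, 0.05),
    (0.30, 0.60, 0.10),
    (0.10, 0.50, 0.40),
    (0.05, 0.35, 0.60),
    (0.70, 0.28, 0.02),
]
phenotype = [1.0, 1.2, 1.5, 2.1, 2.6, 0.9]

dosages = [dosage(gp) for gp in gps]
hard_calls = [hard_call(gp) for gp in gps]

print("EAF  dosage:", alt_allele_freq(dosages), "hard:", alt_allele_freq(hard_calls))
print("BETA dosage:", ols_slope(dosages, phenotype), "hard:", ols_slope(hard_calls, phenotype))
```

On imputed data with uncertain genotypes, both the allele frequency and the effect estimate shift between the two codings, so comparing outputs is only meaningful when both tools are given the same coding.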

dajiangliu avatar Dec 07 '18 19:12 dajiangliu