bcftools gtcheck ignores/skips sites with symbolic ALT alleles
Ran into an issue with bcftools gtcheck (version 1.12) when comparing two VCFs, one of which has the symbolic ALT allele <NON_REF>. The output reports that the number of sites compared is 0, even though matching positions exist in the two VCFs. To investigate, I created a dummy query VCF with one record.
Query VCF has the following line:
1 752721 . A G,<NON_REF> 185.44 PASS DP=47;MQ=204.19;FractionInformativeReads=0.979 GT:AD:AF:DP:F1R2:F2R1:GQ:PL:SPL:ICNT:GP:PRI:SB:MB 1/1:0,46,0:1.000,0.000:46:0,24,0:0,22,0:135:223,138,0,1965,138,1965:255,141,0:0,0:1.8544e+02,1.3544e+02,0.0000e+00,4.5000e+02,1.7021e+02,4.5000e+02:0.00,34.77,37.77,34.77,69.54,37.77:0,0,30,16:0,0,28,18
Genotypes VCF has the following line:
1 752721 rs3131972 A G . PASS AL=A/G;ST=+ GT:GC 1/1:0.8366
bcftools command:
bcftools gtcheck -e 0 --no-HWE-prob -u GT,GT -g genotypes.vcf.gz query.vcf.gz
Output:
#DC [2]Query Sample [3]Genotyped Sample [4]Discordance [5]-log P(HWE) [6]Number of sites compared
DC DUMMYSAMPLE WG0341934-DNAA01_NA12878 0 0.000000e+00 0
Deleting just the symbolic allele in the ALT field from the query VCF record produces the desired output:
#DC [2]Query Sample [3]Genotyped Sample [4]Discordance [5]-log P(HWE) [6]Number of sites compared
DC DUMMYSAMPLE WG0341934-DNAA01_NA12878 0 0.000000e+00 1
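For anyone who wants to reproduce the manual edit on more than one record, here is a rough Python sketch of that step. It is not part of bcftools; the function name `strip_symbolic_alt` is made up for illustration, and the example record is an abbreviated version of the query line above. It assumes tab-separated VCF lines and only edits the ALT column, so per-allele fields such as AD and PL still carry entries for the removed allele. That makes the result suitable only as a diagnostic with `-u GT,GT`, and only when no genotype refers to the removed allele.

```python
SYMBOLIC = "<NON_REF>"  # symbolic allele from the query VCF in this report


def strip_symbolic_alt(line: str, symbolic: str = SYMBOLIC) -> str:
    """Return a VCF line with `symbolic` removed from the ALT column.

    Header lines are returned unchanged. Only column 5 (ALT) is edited;
    INFO and FORMAT values are left as they are, so per-allele fields
    may no longer match the number of ALT alleles.
    """
    if line.startswith("#"):
        return line
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        # Not a complete data record; leave it alone.
        return line
    alts = [allele for allele in fields[4].split(",") if allele != symbolic]
    # An empty ALT column is written as "." in VCF.
    fields[4] = ",".join(alts) if alts else "."
    return "\t".join(fields) + newline


if __name__ == "__main__":
    # Abbreviated form of the query record (most INFO/FORMAT fields dropped).
    record = "1\t752721\t.\tA\tG,<NON_REF>\t185.44\tPASS\tDP=47\tGT\t1/1\n"
    print(strip_symbolic_alt(record), end="")
```

Applied line by line to the decompressed query VCF, this reproduces the hand-edited record used for the second gtcheck run. The output needs to be compressed with bgzip and indexed again before passing it back to bcftools.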