hap.py
What does "BD=N" represent?
I use hap.py with engine=vcfeval to compare germline variants. Some variant lines have BD=N. What does "BD=N" mean? It seems that these variants are called in both TRUTH and QUERY. Thanks! JY
| #CHROM | POS | ID | REF | ALT | QUAL | FILTER | INFO | FORMAT | TRUTH | QUERY |
|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 32552205 | . | CGGACAGCGACGCCACCATCCGGGGCTCCCTGAGCGGGGTGCGGGCGCTGGAACCT | C | . | . | . | GT:BD:BI:BVT:BLT:QQ | 0/1:N:d16_plus:INDEL:het:. | 0/1:N:d16_plus:INDEL:het:0 |
| 6 | 32552206 | . | G | T | . | . | . | GT:BD:BI:BVT:BLT:QQ | 0/1:N:tv:SNP:het:. | 0/1:N:tv:SNP:het:0 |
| 6 | 32552212 | . | C | T | . | . | . | GT:BD:BI:BVT:BLT:QQ | 0/1:N:ti:SNP:het:. | 0/1:N:ti:SNP:het:0 |
See the format definitions:
- https://github.com/ga4gh/benchmarking-tools/blob/master/doc/ref-impl/intermediate.md
- https://github.com/ga4gh/benchmarking-tools/blob/master/doc/ref-impl/outputs.md
BD=N means the variant was not assessed: for example, it lies outside the evaluation regions or was excluded for other reasons. With vcfeval, it may be in a region containing so many overlapping variants that the search space becomes too large, causing the region to be skipped.
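To see how many records carry each decision, you could tally the BD values yourself. The following is only a rough sketch: it assumes the hap.py output VCF has been decompressed to tab-separated text with TRUTH and QUERY in the 10th and 11th columns (as in the records above), and the function names `bd_values` and `count_bd` are made up for illustration, not part of hap.py.

```python
from collections import Counter


def bd_values(vcf_line):
    """Return the (TRUTH, QUERY) BD values for one tab-separated VCF record.

    Assumes TRUTH is column 10 and QUERY is column 11, as in the
    records quoted in the question.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    bd_index = format_keys.index("BD")

    def pick(sample):
        values = sample.split(":")
        # VCF allows trailing sample fields to be dropped; treat as missing.
        return values[bd_index] if bd_index < len(values) else "."

    return pick(fields[9]), pick(fields[10])


def count_bd(lines):
    """Count (TRUTH BD, QUERY BD) pairs, skipping header and blank lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        counts[bd_values(line)] += 1
    return counts


# Two of the records from the question, written as tab-separated text.
example = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tTRUTH\tQUERY",
    "6\t32552206\t.\tG\tT\t.\t.\t.\tGT:BD:BI:BVT:BLT:QQ\t0/1:N:tv:SNP:het:.\t0/1:N:tv:SNP:het:0",
    "6\t32552212\t.\tC\tT\t.\t.\t.\tGT:BD:BI:BVT:BLT:QQ\t0/1:N:ti:SNP:het:.\t0/1:N:ti:SNP:het:0",
]

if __name__ == "__main__":
    for (truth_bd, query_bd), n in count_bd(example).items():
        print(f"TRUTH BD={truth_bd}\tQUERY BD={query_bd}\t{n}")
```

For a real file, you would pass an open file handle to `count_bd` instead of the `example` list. A large number of `N`/`N` pairs clustered in one locus would be consistent with a region that was skipped rather than with individual mismatches.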