Revision of SB INFO Number and type
#290 fixed #189, which requested a definition of the SB INFO field, based on the observation that GATK defines SB as Number=4,Type=Integer (https://github.com/samtools/hts-specs/issues/189#issue-209808608). However, that definition applies to the per-sample SB, that is, to the SB FORMAT field. All VCFs in the gatk repository are consistent with this:
find . -type f -name "*.vcf" -exec grep -h "^##FORMAT=<ID=SB," {} \; | sort | uniq -c
225 ##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component statistics which comprise the Fisher's Exact Test to detect strand bias.">
For SB in the INFO field, by contrast, the same VCFs use Number=1 and Type=Float, as shown below:
find . -type f -name "*.vcf" -exec grep -h "^##INFO=<ID=SB," {} \; | sort | uniq -c
41 ##INFO=<ID=SB,Number=1,Type=Float,Description="Strand bias">
21 ##INFO=<ID=SB,Number=1,Type=Float,Description="Strand Bias">
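For anyone who wants to repeat the check without two separate grep runs, here is a minimal Python sketch that tallies SB header definitions by section, Number and Type. The function name `tally_sb_definitions` and the regular expression are my own illustration, not part of GATK or htslib, and the sample lines are copied from the output above (one of each variant), so the counts it prints are for that small sample only.

```python
import re
from collections import Counter

# Matches the start of a VCF meta-information line for the SB key and
# captures the section (INFO or FORMAT), the Number, and the Type.
# Assumption: Number and Type appear in this order right after the ID,
# as they do in the header lines quoted in this issue.
SB_HEADER = re.compile(r'^##(INFO|FORMAT)=<ID=SB,Number=([^,]+),Type=([^,>]+)')


def tally_sb_definitions(header_lines):
    """Count (section, Number, Type) combinations among SB header lines."""
    counts = Counter()
    for line in header_lines:
        match = SB_HEADER.match(line)
        if match:
            counts[match.groups()] += 1
    return counts


# Sample header lines taken from the grep output above, one per variant.
example = [
    '##FORMAT=<ID=SB,Number=4,Type=Integer,Description="Per-sample component '
    'statistics which comprise the Fisher\'s Exact Test to detect strand bias.">',
    '##INFO=<ID=SB,Number=1,Type=Float,Description="Strand bias">',
    '##INFO=<ID=SB,Number=1,Type=Float,Description="Strand Bias">',
]

for (section, number, value_type), count in sorted(tally_sb_definitions(example).items()):
    print(section, number, value_type, count)
```

Grouping on the parsed fields rather than on the whole line also merges the two INFO variants, which differ only in the capitalisation of the Description.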
I think the decision in #290 and #189 should be revised.