Need help understanding VCF specification to filter VCF file

Open HINOUX opened this issue 4 years ago • 4 comments

This is the command that I used for SNP calling:

bcftools mpileup -f reference.fa alignments.bam | bcftools call -mv -Ov -o SNP_calling.vcf

This is an example of the output:

Chr1 271 . G GAAATAGCATA,GCATA 999 . INDEL;IDV=3;IMF=1;DP=470;VDB=0.443312;SGB=-75.6152;MQSB=0.00212381;MQ0F=0;ICB=0.0308101;HOB=0.108465;AC=61,21;AN=286;DP4=311,70,89,0;MQ=32 GT:PL 0/0:0,48,255,48,255,255 0/1:15,0,8,21,11,27 1/2:60,13,21,39,0,46 0/0:0,48,255,48,255,255 0/0:0,54,255,54,255,255 0/0:0,51,255,51,255,255 0/0:0,27,241,27,241,241

How can I filter on GQ (genotype quality) and DP (read depth) for each position?

Could you please help me read this result: "0/0:0,27,241,27,241,241"? The genotype is 0/0, but what comes after it? It is different from the format GT:GQ:DP:HQ or GT:GQ:DP:AD:PL, and the separator is not ":" but ",".

I would be very grateful if you could help me.

HINOUX, Dec 13 '19 17:12

Please read the VCF specification here: http://samtools.github.io/hts-specs/VCFv4.3.pdf

Regarding filtering with bcftools, you can use the -i/-e (include/exclude) options, which take a filtering expression.
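As a rough illustration of how the specification applies to the sample column in the question, the sketch below splits a "GT:PL" entry on ":" and then splits the PL value on ",". It assumes diploid genotypes and relies on the ordering given in the VCF specification, where the likelihood for genotype j/k (j <= k) sits at index k*(k+1)/2 + j. The helper names `genotype_order` and `decode_sample` are hypothetical and are not part of bcftools.

```python
def genotype_order(n_alleles):
    """Diploid genotypes in VCF PL order: j/k (j <= k) is at index k*(k+1)//2 + j."""
    return [f"{j}/{k}" for k in range(n_alleles) for j in range(k + 1)]


def decode_sample(format_keys, sample, n_alleles):
    """Pair each FORMAT key with its value, then label each PL value with its genotype."""
    fields = dict(zip(format_keys.split(":"), sample.split(":")))
    pls = [int(x) for x in fields["PL"].split(",")]
    return fields["GT"], dict(zip(genotype_order(n_alleles), pls))


# The record in the question has REF=G and two ALT alleles, so three alleles in total.
gt, pl = decode_sample("GT:PL", "0/0:0,27,241,27,241,241", n_alleles=3)
print(gt)
for genotype, phred in pl.items():
    print(genotype, phred)
```

PL values are phred-scaled genotype likelihoods normalised so that the most likely genotype is 0, which is why the called genotype in each sample of the example lines up with the 0 entry. Note also that the FORMAT column in the question is GT:PL only, so that file carries no per-sample GQ or DP to filter on.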

pd3, Dec 28 '19 14:12

I have a related problem. I generated a bcf file using bam files from 31 samples:

bcftools mpileup S1sorted.bam S2sorted.bam S3sorted.bam ... S31sorted.bam > myraw.bcf

then used the following:

bcftools call -m -v myraw.bcf -o variant.bcf
bcftools view -Ou variant.bcf -o variant.vcf

In the INFO column I am finding fields that are not explained in the VCFv4.3.pdf, e.g. VDB, SGB, MQSB, MQ0F, DP4, RPB, MQB, BQB, ICB, HOB, AC1, AC2.

Is there another samtools/bcftools manual that would explain the above measures?

ewrubin, Jul 22 '20 11:07

There is a brief description of each tag in the VCF header, but it is a good point that this should be documented better. I am reopening the issue and labeling appropriately.
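For anyone looking for those header descriptions, here is a minimal sketch that pulls the ID and Description out of the "##INFO=<...>" meta lines of an uncompressed VCF (the header alone can be printed with bcftools view -h). The function name `info_descriptions` is hypothetical, and the header lines in the example are placeholders rather than the exact text bcftools writes, which may differ between versions.

```python
import re

# Matches meta lines such as ##INFO=<ID=TAG,Number=1,Type=Float,Description="...">
INFO_RE = re.compile(r'^##INFO=<ID=([^,]+),.*Description="([^"]*)"')


def info_descriptions(lines):
    """Map each INFO tag ID to its Description, reading only the '##' meta lines."""
    out = {}
    for line in lines:
        if not line.startswith("##"):
            break  # the '#CHROM' column line ends the meta-information block
        match = INFO_RE.match(line)
        if match:
            out[match.group(1)] = match.group(2)
    return out


# Placeholder header; a real file would be read with open("variant.vcf").
header = [
    '##fileformat=VCFv4.2',
    '##INFO=<ID=DP4,Number=4,Type=Integer,Description="(placeholder) ref/alt forward/reverse base counts">',
    '##INFO=<ID=MQ0F,Number=1,Type=Float,Description="(placeholder) fraction of MQ0 reads">',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO',
]

for tag, description in info_descriptions(header).items():
    print(tag, description, sep="\t")
```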

pd3, Oct 13 '20 07:10