bcftools
bcftools copied to clipboard
Could bcftools report all states of characters for very low coverage?
Hello,
I'm working with a phased diploid assembly and would like to implement an elementary variant calling strategy. My goal is to make observational calls based on the assumption that most positions should have a coverage depth of 2 and that the alignment is correct. Here's my current approach:
If the reference allele is A, I'd like to report A/T as a 0/1 GT. I'm currently experimenting with the following command:
bcftools mpileup -f ../../ref.fa H5A2.dual.sorted.bam | bcftools call -m -A
My initial observations suggest it might work as I want. But I fear some under-the-hood filtering due to the low coverage.
Can you confirm whether my understanding of this command's behavior is correct, given my intended goal?
I asked about this on Biostars, but no one came back to me after 20 days.
Thanks so much for any help.
When the -A
option is added and -v
is not used, this tells the call
command to keep all sites, including non-variant sites, and to keep all observed alleles in the ALT column, even if the genotype is not recognized as variant. So the genotype can be 0/0
, but the ALT column will still carry an evidence for the other observed alleles. Adding mpileup -a FORMAT/AD
will also record the number of reads supporting each allele.
Thanks for your answer; I am encountering an "extreme value encountered."
bcftools mpileup -f ../../ref.fa -a FORMAT/AD H5A2.dual.sorted.bam|bcftools call -m -A -v > H5A2.pileup
[mpileup] 1 sample in 1 input file
Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid
[mpileup] maximum number of reads per input file set to -d 250
[W::vcf_parse_format_fill5] Extreme FORMAT/AD value encountered and set to missing at Chrom_3:144644
[1] 2315323 killed bcftools mpileup -f ../../ref.fa -a FORMAT/AD H5A2.dual.sorted.bam |
2315324 done bcftools call -m -A -v > H5A2.pileup
When looking into Tablet for that position, I see nothing special, it's just the two contigs mapping there.
I don't understand what "extreme" is there.
Is bcftools not suitable for my application? If not, do you have any alternative suggestions? EDIT: it seems a good old samtools mpileup could actually give me what I want. Thanks
Any chance you could provide a small test case to reproduce the problem? It is concerning that bcftools mpileup
should produce a VCF that cannot be parsed by bcftools call
[W::vcf_parse_format_fill5] Extreme FORMAT/AD value encountered and set to missing at Chrom_3:144644