bcftools icon indicating copy to clipboard operation
bcftools copied to clipboard

Could bcftools report all states of characters for very low coverage?

Open Axze-rgb opened this issue 3 months ago • 4 comments

Hello,

I'm working with a phased diploid assembly and would like to implement an elementary variant calling strategy. My goal is to make observational calls based on the assumption that most positions should have a coverage depth of 2 and that the alignment is correct. Here's my current approach:

If the reference allele is A, I'd like to report A/T as a 0/1 GT. I'm currently experimenting with the following command:

bcftools mpileup -f ../../ref.fa H5A2.dual.sorted.bam | bcftools call -m -A

My initial observations suggest it might work as I want. But I fear some under-the-hood filtering due to the low coverage.

Can you confirm whether my understanding of this command's behavior is correct, given my intended goal?

I asked about this on Biostars, but no one came back to me after 20 days.

Thanks so much for any help.

Axze-rgb avatar Mar 26 '24 17:03 Axze-rgb

When the -A option is added and -v is not used, this tells the call command to keep all sites, including non-variant sites, and to keep all observed alleles in the ALT column, even if the genotype is not recognized as variant. So the genotype can be 0/0, but the ALT column will still carry an evidence for the other observed alleles. Adding mpileup -a FORMAT/AD will also record the number of reads supporting each allele.

pd3 avatar Mar 27 '24 15:03 pd3

Thanks for your answer; I am encountering an "extreme value encountered."

bcftools mpileup -f ../../ref.fa -a FORMAT/AD H5A2.dual.sorted.bam|bcftools call -m -A -v > H5A2.pileup
[mpileup] 1 sample in 1 input file
Note: none of --samples-file, --ploidy or --ploidy-file given, assuming all sites are diploid
[mpileup] maximum number of reads per input file set to -d 250
[W::vcf_parse_format_fill5] Extreme FORMAT/AD value encountered and set to missing at Chrom_3:144644
[1]    2315323 killed     bcftools mpileup -f ../../ref.fa -a FORMAT/AD H5A2.dual.sorted.bam | 
       2315324 done       bcftools call -m -A -v > H5A2.pileup

When looking into Tablet for that position, I see nothing special, it's just the two contigs mapping there.

image

I don't understand what "extreme" is there.

Axze-rgb avatar Mar 27 '24 17:03 Axze-rgb

Is bcftools not suitable for my application? If not, do you have any alternative suggestions? EDIT: it seems a good old samtools mpileup could actually give me what I want. Thanks

Axze-rgb avatar Mar 29 '24 15:03 Axze-rgb

Any chance you could provide a small test case to reproduce the problem? It is concerning that bcftools mpileup should produce a VCF that cannot be parsed by bcftools call

[W::vcf_parse_format_fill5] Extreme FORMAT/AD value encountered and set to missing at Chrom_3:144644

pd3 avatar Apr 02 '24 15:04 pd3