
bcftools view losing results when widening the range.

Open MurphyDavid opened this issue 3 years ago • 1 comments

Using the latest release, bcftools 1.12, I've been hitting an issue where it seems to lose entries when the query range is widened:

/data/bcftools-1.12/bcftools view /mnt/results/pipeline/sample/sample.g.vcf.gz -r chr17:150000-170000 -O v


##contig=<ID=HLA-DRB1*15:03:01:01,length=11567,assembly=Homo_sapiens_assembly38.index>
##contig=<ID=HLA-DRB1*15:03:01:02,length=11569,assembly=Homo_sapiens_assembly38.index>
##contig=<ID=HLA-DRB1*16:02:01,length=11005,assembly=Homo_sapiens_assembly38.index>
##source=HaplotypeCaller
##bcftools_viewVersion=1.12+htslib-
##bcftools_viewCommand=view -r chr17:150000-170000 -O v /mnt/results/pipeline/sample/sample.g.vcf.gz; Date=Wed Jun 16 12:43:47 2021
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample
chr17   155883  .       C       <NON_REF>       .       .       END=156518      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161291  .       C       <NON_REF>       .       .       END=161661      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161912  .       T       <NON_REF>       .       .       END=162399      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   163779  .       A       <NON_REF>       .       .       END=164497      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0

/data/NGS_Software/bcftools-1.12/bcftools view /mnt/results/pipeline/sample/sample.g.vcf.gz -r chr17:140000-170000 -O v


##contig=<ID=HLA-DRB1*15:03:01:01,length=11567,assembly=Homo_sapiens_assembly38.index>
##contig=<ID=HLA-DRB1*15:03:01:02,length=11569,assembly=Homo_sapiens_assembly38.index>
##contig=<ID=HLA-DRB1*16:02:01,length=11005,assembly=Homo_sapiens_assembly38.index>
##source=HaplotypeCaller
##bcftools_viewVersion=1.12+htslib-
##bcftools_viewCommand=view -r chr17:140000-170000 -O v /mnt/results/pipeline/sample/sample.g.vcf.gz; Date=Wed Jun 16 12:44:23 2021
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample
chr17   163779  .       A       <NON_REF>       .       .       END=164497      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0


On another machine that has the same folder mounted, bcftools 1.9 gives what I believe to be the correct result.

/Software/NGS_Software/bcftools-1.9/bcftools/bcftools view /mnt/results/pipeline/sample/sample.g.vcf.gz -r chr17:0-170000 -O v

    
##bcftools_viewCommand=view -r chr17:0-170000 -O v /mnt/results/pipeline/sample/sample.g.vcf.gz; Date=Wed Jun 16 13:45:44 2021
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  sample
chr17   155883  .       C       <NON_REF>       .       .       END=156518      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161291  .       C       <NON_REF>       .       .       END=161661      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161912  .       T       <NON_REF>       .       .       END=162399      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   163779  .       A       <NON_REF>       .       .       END=164497      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0

I have re-indexed the gvcf file just to be sure it wasn't an indexing error.

gatk-4.1.4.0/gatk IndexFeatureFile -F sample.g.vcf.gz

It hasn't changed the behaviour.

Extracting the relevant lines with zcat yields this:


 zcat /mnt/results/pipeline/sample/sample.g.vcf.gz  | head -n 7494428 | tail -n 20
chr16   90175398        .       G       <NON_REF>       .       .       END=90175410    GT:DP:GQ:MIN_DP:PL      0/0:4:6:2:0,6,82
chr16   90175411        .       G       <NON_REF>       .       .       END=90175423    GT:DP:GQ:MIN_DP:PL      0/0:2:3:1:0,3,39
chr16   90175424        .       C       <NON_REF>       .       .       END=90175427    GT:DP:GQ:MIN_DP:PL      0/0:2:6:2:0,6,88
chr16   90175428        .       C       <NON_REF>       .       .       END=90175514    GT:DP:GQ:MIN_DP:PL      0/0:1:3:1:0,3,37
chr16   90175515        .       A       <NON_REF>       .       .       END=90175615    GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr16   90175616        .       A       <NON_REF>       .       .       END=90175626    GT:DP:GQ:MIN_DP:PL      0/0:1:3:1:0,3,37
chr16   90177585        .       T       <NON_REF>       .       .       END=90177964    GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr16   90185840        .       A       <NON_REF>       .       .       END=90186294    GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr16   90222129        .       A       <NON_REF>       .       .       END=90222626    GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   155883  .       C       <NON_REF>       .       .       END=156518      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161291  .       C       <NON_REF>       .       .       END=161661      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   161912  .       T       <NON_REF>       .       .       END=162399      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   163779  .       A       <NON_REF>       .       .       END=164497      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   172191  .       A       <NON_REF>       .       .       END=172347      GT:DP:GQ:MIN_DP:PL      0/0:0:0:0:0,0,0
chr17   172348  .       G       <NON_REF>       .       .       END=172356      GT:DP:GQ:MIN_DP:PL      0/0:1:3:1:0,3,38
chr17   172357  .       T       <NON_REF>       .       .       END=172373      GT:DP:GQ:MIN_DP:PL      0/0:2:6:2:0,6,76
chr17   172374  .       G       <NON_REF>       .       .       END=172375      GT:DP:GQ:MIN_DP:PL      0/0:3:9:3:0,9,128
chr17   172376  .       A       <NON_REF>       .       .       END=172386      GT:DP:GQ:MIN_DP:PL      0/0:4:12:4:0,12,154
chr17   172387  .       A       <NON_REF>       .       .       END=172401      GT:DP:GQ:MIN_DP:PL      0/0:5:15:5:0,15,195
chr17   172402  .       C       <NON_REF>       .       .       END=172411      GT:DP:GQ:MIN_DP:PL      0/0:6:18:6:0,18,198

This behaviour also appears to be present in bcftools 1.10.2.

MurphyDavid, Jun 16 '21 13:06

Can you please index with bcftools index instead and try again? If the problem persists, could you please provide a small test case, including your index? In my tests I was not able to reproduce the problem. I am assuming your bcftools and htslib are from the same release.

pd3, Jun 17 '21 06:06
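
For anyone retrying this, the re-indexing check suggested above could be scripted roughly as follows. This is only a sketch under assumptions, not something posted in the thread: the function names count_records and recheck_regions are made up for illustration, the existing index is assumed to be a tabix .tbi file (hence -t), and bcftools and htslib are assumed to come from the same release.

```shell
#!/bin/sh
# Hypothetical helper: count data records (non-header lines) read from stdin.
count_records() {
    grep -vc '^#'
}

# Hypothetical helper: rebuild the index with bcftools, then compare how many
# records a narrow and a wider region return.
# Usage: recheck_regions FILE NARROW_REGION WIDE_REGION
recheck_regions() {
    vcf=$1
    narrow=$2
    wide=$3

    # -t writes a tabix (.tbi) index; -f overwrites an index that already exists.
    # Assumption: the index being replaced is a .tbi file.
    bcftools index -f -t "$vcf" || return 1

    n_narrow=$(bcftools view -r "$narrow" -O v "$vcf" | count_records)
    n_wide=$(bcftools view -r "$wide" -O v "$vcf" | count_records)
    echo "narrow=$n_narrow wide=$n_wide"

    # A wider region should not return fewer records than a region it contains.
    [ "$n_wide" -ge "$n_narrow" ]
}
```

With the file and ranges from the report this would be called as recheck_regions /mnt/results/pipeline/sample/sample.g.vcf.gz chr17:150000-170000 chr17:140000-170000; a non-zero exit status means the wider query returned fewer records than the narrower one.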