
Issue with Pathogen Profiler combine_vcf_variants.py script

Open taranewman opened this issue 2 months ago • 9 comments

Hello,

I came across an error in the Pathogen Profiler combine_vcf_variants.py script that occurs in approximately half of my samples with TBProfiler v6.2.0. The same samples previously ran successfully with v4.3.0. The failing samples show no clear pattern by lineage or QC metrics.

One SRA sample that reproduces the error is SRR10869015.

If line 171 of combine_vcf_variants.py (the variant.info.update call that sets AF, shown in the traceback below) is commented out, everything appears to run fine.

System specifications: conda, Linux HPC, SLURM

Command Failed:
/bin/bash -c set -o pipefail; bcftools view -c 1 -a <>targets_for_profile.vcf.gz | bcftools view -v snps | combine_vcf_variants.py --ref tbprofiler//tbdb.fasta --gff tbdb.gff --bam <>.bam |  snpEff ann -dataDir snpeff-5.2-0/data -noLog -noStats Mycobacterium_tuberculosis_h37rv -  | bcftools sort -Oz -o <>.vcf.gz && bcftools index <>vcf.gz
stderr:
Writing to ...
Traceback (most recent call last):
  File "<path to conda env> bin/combine_vcf_variants.py", line 171, in <module>
    variant.info.update({'AF':count/dp})
  File "pysam/libcbcf.pyx", line 2798, in pysam.libcbcf.VariantRecordInfo.update
  File "pysam/libcbcf.pyx", line 2621, in pysam.libcbcf.VariantRecordInfo.__setitem__
  File "pysam/libcbcf.pyx", line 698, in pysam.libcbcf.bcf_info_set_value
KeyError: 'unknown INFO: AF'
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from -
Cleaning
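
The KeyError above suggests that pysam refuses to set an INFO key that is not declared in the VCF header. A minimal sketch of a possible workaround follows, assuming (unconfirmed) that the failing samples reach the script without an `##INFO=<ID=AF,...>` header line. The function name `ensure_info_header` and the `Number=A`/`Type=Float` definition are hypothetical and not taken from TBProfiler or Pathogen Profiler.

```python
from typing import Iterable, Iterator

# Assumed header definition; the real script may expect a different Number or Type.
AF_HEADER = '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">'


def ensure_info_header(
    lines: Iterable[str],
    info_id: str = "AF",
    header_line: str = AF_HEADER,
) -> Iterator[str]:
    """Yield VCF lines unchanged, inserting header_line just before the
    #CHROM line when no ##INFO declaration for info_id has been seen."""
    declared = False
    prefix = f"##INFO=<ID={info_id},"
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith(prefix):
            declared = True
        elif text.startswith("#CHROM") and not declared:
            # The header lacks the declaration, so add it before the column line.
            yield header_line + "\n"
            declared = True
        yield line


# Small in-memory example: a header with no AF declaration.
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "Chromosome\t761155\t.\tC\tT\t.\t.\tDP=50\n",
]
fixed = list(ensure_info_header(example))
```

If the assumption holds, a filter like this could sit in the pipe just before combine_vcf_variants.py (reading stdin and writing stdout). It would only hide the symptom, though. It would not explain why the declaration is missing for some samples and not others.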

taranewman, Apr 23 '24 22:04