TBProfiler
Issue with Pathogen Profiler combine_vcf_variants.py script
Hello,
I came across an error with the Pathogen Profiler combine_vcf_variants.py script that occurs in approximately half of my samples with TBProfiler v6.2.0. The same samples previously ran successfully with v4.3.0. The failing samples show no clear pattern in lineage or QC metrics.
An SRA sample that reproduces this error is SRR10869015.
If line 171 of combine_vcf_variants.py (the variant.info.update call shown in the traceback below) is commented out, everything appears to run fine.
System specifications: conda, Linux HPC, SLURM
Command Failed:
/bin/bash -c set -o pipefail; bcftools view -c 1 -a <>targets_for_profile.vcf.gz | bcftools view -v snps | combine_vcf_variants.py --ref tbprofiler//tbdb.fasta --gff tbdb.gff --bam <>.bam | snpEff ann -dataDir snpeff-5.2-0/data -noLog -noStats Mycobacterium_tuberculosis_h37rv - | bcftools sort -Oz -o <>.vcf.gz && bcftools index <>vcf.gz
stderr:
Writing to ...
Traceback (most recent call last):
  File "<path to conda env> bin/combine_vcf_variants.py", line 171, in <module>
    variant.info.update({'AF':count/dp})
  File "pysam/libcbcf.pyx", line 2798, in pysam.libcbcf.VariantRecordInfo.update
  File "pysam/libcbcf.pyx", line 2621, in pysam.libcbcf.VariantRecordInfo.__setitem__
  File "pysam/libcbcf.pyx", line 698, in pysam.libcbcf.bcf_info_set_value
KeyError: 'unknown INFO: AF'
[E::bcf_hdr_read] Input is not detected as bcf or vcf format
Could not read VCF/BCF headers from -
Cleaning
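A possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: pysam raises KeyError: 'unknown INFO: AF' when a record is given an INFO key that its header does not declare, so the affected samples may be reaching combine_vcf_variants.py with a VCF header that lacks an AF definition. The later bcf_hdr_read message would then just be a downstream effect of the Python step exiting without writing any output. The sketch below is a stdlib-only filter that adds an AF header line when none is present. The function name ensure_af_declared and the Number/Type/Description values are assumptions for illustration and are not taken from TBProfiler or Pathogen Profiler.

```python
# Hypothetical INFO declaration; Number, Type and Description are assumed values.
AF_HEADER = '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele frequency">'


def ensure_af_declared(lines):
    """Yield VCF text lines, adding an INFO/AF header line if none is present.

    `lines` is any iterable of uncompressed VCF lines, each ending in a newline.
    """
    declared = False
    for line in lines:
        if line.startswith("##INFO=<ID=AF,"):
            # The header already declares AF, so nothing needs to be added.
            declared = True
        elif line.startswith("#CHROM") and not declared:
            # Meta-information lines must come before the #CHROM column header.
            yield AF_HEADER + "\n"
            declared = True
        yield line
```

Wrapping this as sys.stdout.writelines(ensure_af_declared(sys.stdin)) would turn it into a filter that could sit between bcftools view -v snps and combine_vcf_variants.py to test the hypothesis. If the error disappears, the missing header definition is the likely cause and the proper fix belongs in the script itself.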