PyVCF
'nul' in INFO value with type float causes error in Reader
In a VCF file created by the GATK pipeline, various statistics in the INFO field can have a value of nul. See the attached example VCF.
For example (other fields have the same issue), this header definition:
##INFO=<ID=AS_BaseQRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt Vs. Ref base qualities">
might appear as:
chr1 857100 rs2905036 C T 100406.89 PASS AC=2;AF=1.00;AN=2;AS_BaseQRankSum=nul;AS_FS=0.000; ...
The error is:
Traceback (most recent call last):
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals) # execute the script
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/mnt/hdd/germline/applications/scripts/vep_vcf_reader.py", line 28, in <module>
    for record in vcf_reader:
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 572, in __next__
    info = self._parse_info(row[7])
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 397, in _parse_info
    val = self._map(float, vals)
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 360, in _map
    for x in iterable]
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 360, in <listcomp>
    for x in iterable]
ValueError: could not convert string to float: 'nul'
The code that causes the error is in vcf/parser.py, at lines 357-360 in the Reader class:
    def _map(self, func, iterable, bad='.'):
        '''``map``, but make bad values None.'''
        return [func(x) if x != bad else None
                for x in iterable]
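The failure can be reproduced without PyVCF or a VCF file. The sketch below is a standalone copy of the _map logic quoted above; the function name map_with_missing is made up for illustration and is not part of PyVCF.

```python
def map_with_missing(func, iterable, bad='.'):
    """Standalone copy of the Reader._map logic (hypothetical name)."""
    return [func(x) if x != bad else None
            for x in iterable]


# The standard VCF missing value '.' is handled as intended.
print(map_with_missing(float, ['1.5', '.']))  # [1.5, None]

# 'nul' is not equal to bad, so it is passed straight to float().
try:
    map_with_missing(float, ['nul'])
except ValueError as exc:
    print(exc)  # could not convert string to float: 'nul'
```

Only the exact string '.' is mapped to None, so any other placeholder reaches the conversion function unchanged.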
A quick-n-dirty patch is:
    def _map(self, func, iterable, bad='.'):
        '''``map``, but make bad values None.'''
        return [func(x) if x != bad and not (func == float and x == 'nul') else None
                for x in iterable]
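A slightly more general variant might treat a configurable set of placeholder strings as missing instead of special-casing float. This is only a sketch: the names MISSING_TOKENS and map_or_none are hypothetical, and the choice of tokens is an assumption based on the values seen in this report.

```python
# Assumed set of placeholders to treat as missing; extend as needed.
MISSING_TOKENS = frozenset({'.', 'nul'})


def map_or_none(func, iterable, missing=MISSING_TOKENS):
    """Apply func to each item, returning None for missing-value tokens."""
    return [None if x in missing else func(x)
            for x in iterable]


print(map_or_none(float, ['0.5', 'nul', '.']))  # [0.5, None, None]
print(map_or_none(int, ['2', '.']))             # [2, None]
```

Unlike the quick patch, this also maps 'nul' to None for non-float conversions, which could hide a legitimate string value of 'nul' in a String-typed field. Whether that trade-off is acceptable depends on the data.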