PyVCF

'nul' in INFO value with type float causes error in Reader

Open myourshaw opened this issue 6 years ago • 0 comments

In a VCF file created by the GATK pipeline, several statistics in the INFO field can have the value nul. See the attached example VCF.

For example, this header line (other fields have the same issue):

##INFO=<ID=AS_BaseQRankSum,Number=A,Type=Float,Description="allele specific Z-score from Wilcoxon rank sum test of each Alt Vs. Ref base qualities">

might appear as:

chr1 857100 rs2905036 C T 100406.89 PASS AC=2;AF=1.00;AN=2;AS_BaseQRankSum=nul;AS_FS=0.000; ...
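To find which INFO keys are affected in a given record, the INFO column can be scanned directly, without going through PyVCF. This is a minimal sketch: `keys_with_value` is a hypothetical helper name, and the INFO string is the visible part of the record above with the elided tail left out.

```python
def keys_with_value(info, value='nul'):
    """Return the INFO keys whose (comma-separated) values include `value`."""
    hits = []
    for item in info.split(';'):
        # Flag-type entries have no '=', so sep is '' and they are skipped.
        key, sep, val = item.partition('=')
        if sep and value in val.split(','):
            hits.append(key)
    return hits


# Visible part of the INFO column from the record above.
info = "AC=2;AF=1.00;AN=2;AS_BaseQRankSum=nul;AS_FS=0.000"
print(keys_with_value(info))
```

Running this over column 8 of every data line in the attached VCF should list the other fields with the same problem.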

The error is:

Traceback (most recent call last):
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1664, in <module>
    main()
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1658, in main
    globals = debugger.run(setup['file'], None, None, is_module)
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/pydevd.py", line 1068, in run
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/home/yoursham/apps/pycharm-2018.1.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/mnt/hdd/germline/applications/scripts/vep_vcf_reader.py", line 28, in <module>
    for record in vcf_reader:
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 572, in __next__
    info = self._parse_info(row[7])
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 397, in _parse_info
    val = self._map(float, vals)
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 360, in _map
    for x in iterable]
  File "/home/yoursham/apps/python-3.7/lib/python3.7/site-packages/vcf/parser.py", line 360, in <listcomp>
    for x in iterable]
ValueError: could not convert string to float: 'nul'

The code that causes the error is in vcf/parser.py at lines 357-360 in the Reader class:

    def _map(self, func, iterable, bad='.'):
        '''``map``, but make bad values None.'''
        return [func(x) if x != bad else None
                for x in iterable]

A quick-n-dirty patch is:

    def _map(self, func, iterable, bad='.'):
        '''``map``, but make bad values None.'''
        return [func(x) if x != bad and not (func is float and x == 'nul') else None
                for x in iterable]
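A slightly more general variant of the same idea keeps the extra placeholders in a parameter instead of hard-coding 'nul' in the comprehension. This is a sketch only: `map_or_none` and its `float_missing` parameter are hypothetical names, not part of PyVCF, and it is written as a plain function rather than a Reader method.

```python
def map_or_none(func, iterable, bad='.', float_missing=('nul',)):
    """Like the patched _map: convert values, returning None for missing ones.

    `float_missing` (an assumed, illustrative parameter) lists extra tokens
    treated as missing only when converting to float, so a String-typed
    field containing 'nul' is left alone.
    """
    def convert(x):
        if x == bad:
            return None
        if func is float and x in float_missing:
            return None
        return func(x)

    return [convert(x) for x in iterable]


print(map_or_none(float, ['1.5', 'nul', '.']))
print(map_or_none(str, ['nul', '.']))
```

Restricting the extra tokens to float conversion matches the behavior of the patch above; whether Integer fields need the same treatment depends on what GATK emits for them.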

ValExome0011-nul-issue-example.vcf.tar.gz

myourshaw, Jul 13 '18 00:07