pheweb parsing errors

Open jzluo opened this issue 3 years ago • 1 comment

Hi, I'm running into an issue that I think comes from parse-input-files:

Please include:

  • the version of pheweb you're using, gotten from pheweb -h. If you're not on the latest version, consider upgrading with pip3 install --upgrade and trying again.
PheWeb 1.3.13
  • the command you were running and its output/error.
$ pheweb phenolist verify &&     pheweb parse-input-files &&     pheweb sites &&     pheweb make-gene-aliases-sqlite3 &&     pheweb add-rsids &&     pheweb add-genes
The 181 phenotypes in ~/pheweb/pheno-list.json' look good.
Processing 181 phenos
Completed  181 tasks in 143 minutes                                      
Working set contains 13 input files and 0 merged files, with 21 tasks in progress and 0 seconds elapsed

Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222
(Details in ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.431163)

It hangs at this point.

$ cat ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.431163 
======= Exception ====
Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222

======= Traceback ====
Traceback (most recent call last):
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 131, in main
    run(sys.argv[1:])
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 121, in run
    handlers[subcommand](argv[1:])
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 62, in f
    module_run(argv)
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 63, in run
    manna.apply_ret(ret)
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 100, in apply_ret
    raise PheWebError('Child process had exception, info dumped to {}'.format(exc_filepath))
pheweb.utils.PheWebError: Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222

$ cat ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222 
Child process had exception:
   (['6', '351611.0', '6.7', '0.00011'], ['chrom', 'pos', 'ref', 'alt', 'pval', 'beta', 'sebeta', 'maf'])
Traceback:
   Traceback (most recent call last):
     File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 131, in mp_target
       for ret in merge(task['files_to_merge'], task['out_filepath']):
     File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 187, in merge
       new_v = next(readers[reader_id])
     File "/home/jon/.local/lib/python3.7/site-packages/pheweb/file_utils.py", line 169, in _get_variants
       assert len(unparsed_variant) == len(self._all_fields), (unparsed_variant, self._all_fields)
   AssertionError: (['6', '351611.0', '6.7', '0.00011'], ['chrom', 'pos', 'ref', 'alt', 'pval', 'beta', 'sebeta', 'maf'])

  • snippets of relevant files, especially files mentioned in the error.

Snippet of parsed file:

$ zcat ~/pheweb/generated-by-pheweb/parsed/008_52 | grep "6      351611"
6	35161144	G	T	0.026	3.0	1.1	0.00014
6	351611.0	6.7	0.00011
6	3516110.016	.3	0.0	01	0.04

gzip: 008_52: invalid compressed data--format violated

Snippet of input file:

6,331945,C,A,0.7636440987725183,0.000107198,-1.02973,3.42447
6,348277,A,G,0.7509724415781722,0.000260193,-1.01563,3.20025
6,365986,A,G,0.8270841534049443,0.000122692,-1.01508,4.6469
  • your config.py.
hg_build_number = 38
show_manhattan_filter_consequence = True
show_manhattan_filter_button=True 

Otherwise, parsing seems to work for most of the file, from what I can see.
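
One way to check the whole parsed file rather than a grep sample is a short script like the one below. It is only a sketch and not part of PheWeb: `find_bad_rows` is a hypothetical helper, and it assumes the parsed file is gzip-compressed, tab-delimited text whose first line is the header. It collects every row whose field count differs from the header (the condition the assertion in file_utils.py trips on) and records where decompression stops if the gzip stream is damaged.

```python
import gzip
import zlib


def find_bad_rows(path, delimiter="\t"):
    """Return (bad_rows, error).

    bad_rows: (line_number, fields) for rows whose field count differs
              from the first line, which is assumed to be the header.
    error:    a message if the gzip stream could not be read to the end,
              otherwise None.
    """
    bad_rows = []
    error = None
    expected = None
    line_number = 0
    try:
        with gzip.open(path, "rt") as handle:
            for line_number, line in enumerate(handle, start=1):
                fields = line.rstrip("\n").split(delimiter)
                if expected is None:
                    expected = len(fields)  # assumption: line 1 is the header
                elif len(fields) != expected:
                    bad_rows.append((line_number, fields))
    except (OSError, EOFError, zlib.error) as exc:
        # Corrupt or truncated gzip data surfaces as one of these exceptions.
        error = "decompression failed after line {}: {}".format(line_number, exc)
    return bad_rows, error
```

Calling `find_bad_rows` on the expanded path of generated-by-pheweb/parsed/008_52 would show whether the short rows are confined to one region of the file and whether they sit next to the point where gzip reports a format violation.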

jzluo • Jun 08 '21 20:06

This is a great bug report, thanks.

Can you show me the line in your input file with 6,351611?

And also the line with 3516110.016? pos is treated as an integer by pheweb, so I don't understand why it wrote out 3516110.016. That's probably related to the problem.
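
If it helps to rule out the input side, a check along these lines would list any rows whose position is not a plain integer. This is a sketch, not PheWeb code: `non_integer_positions` is a hypothetical helper, and it assumes the input is comma-separated with the position in the second column, as in the snippet above. A header row, if the file has one, will also be reported.

```python
import csv
import re


def non_integer_positions(lines, pos_column=1, delimiter=","):
    """Yield (line_number, raw_value) for rows whose position is not a plain integer.

    pos_column=1 assumes the layout chrom,pos,... seen in the input snippet.
    Rows too short to contain a position are reported with a value of None.
    """
    for line_number, row in enumerate(csv.reader(lines, delimiter=delimiter), start=1):
        if len(row) <= pos_column:
            yield line_number, None
            continue
        value = row[pos_column].strip()
        if not re.fullmatch(r"[0-9]+", value):
            yield line_number, value
```

Passing an open file handle for the input file as `lines` works, since `csv.reader` accepts any iterable of strings. An empty result would suggest the input is clean and the odd values were introduced later, when the parsed file was written.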

Could you show me the output of zcat ~/pheweb/generated-by-pheweb/parsed/008_52 | grep "6 351611" | hexdump -C? Perhaps there are more tabs in there that didn't survive the copy-paste.
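
If hexdump isn't available, printing the `repr()` of the matching lines shows the same thing, since tabs come out as `\t`. A minimal sketch, with `show_raw` as a hypothetical helper name:

```python
def show_raw(lines, needle):
    """Return repr() of each line containing needle, so tabs appear as \\t."""
    return [repr(line) for line in lines if needle in line]
```

The lines can come from `gzip.open(path, "rt")` on the parsed file, although iteration will stop with an exception at the point where the compressed data is damaged.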

pjvandehaar • Jun 09 '21 20:06