Gemini not found error while loading chunks from VCF

Open frankMusacchia opened this issue 3 years ago • 4 comments

Hello, I am trying to use gemini. After installing all the required dependencies, the installation completed successfully. I am using a large WGS VCF file (~12 GB) with ~1000 samples. The first time I used the "load" function, gemini complained that yaml.load was deprecated and I got an error:

/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/config.py:61: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  config = yaml.load(in_handle)

CADD scores are not being loaded because the annotation file could not be found. Run gemini update --dataonly --extra cadd_score to install the annotation file.

GERP per bp is not being loaded because the annotation file could not be found. Run gemini update --dataonly --extra gerp_bp to install the annotation file.

Loading 1714760 variants.
Breaking /media/francesco/I/gemini/ppmi.july2018.chr14.vqsr.norm.vcf.gz into 12 chunks.
Loading chunk 0.
Loading chunk 1.
/bin/sh: 1: gemini: not found
Loading chunk 2.
/bin/sh: 1: gemini: not found
Loading chunk 3.
/bin/sh: 1: gemini: not found
Loading chunk 4.
/bin/sh: 1: gemini: not found
Loading chunk 5.
/bin/sh: 1: gemini: not found
Loading chunk 6.
/bin/sh: 1: gemini: not found
Loading chunk 7.
/bin/sh: 1: gemini: not found
Loading chunk 8.
/bin/sh: 1: gemini: not found
Loading chunk 9.
/bin/sh: 1: gemini: not found
Loading chunk 10.
/bin/sh: 1: gemini: not found
Loading chunk 11.
/bin/sh: 1: gemini: not found
/bin/sh: 1: gemini: not found
Traceback (most recent call last):
  File "/home/francesco/bin/gemini_tools/bin/gemini", line 7, in <module>
    gemini_main.main()
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1249, in main
    args.func(parser, args)
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 204, in load_fn
    gemini_load.load(parser, args)
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 49, in load
    load_multicore(args)
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 93, in load_multicore
    chunks = load_chunks_multicore(grabix_file, args)
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 264, in load_chunks_multicore
    wait_until_finished(procs)
  File "/home/francesco/bin/gemini_data/anaconda/lib/python2.7/site-packages/gemini/gemini_load.py", line 359, in wait_until_finished
    raise ValueError("Processing failed on GEMINI chunk load")
ValueError: Processing failed on GEMINI chunk load

I went to: https://github.com/yaml/pyyaml/wiki/PyYAML-yaml.load(input)-Deprecation and I chose to modify the code to safe_yaml

But I still get the same error. Can you please tell me if there is a known solution to this?

Thanks,
Francesco

frankMusacchia avatar Jan 21 '22 08:01 frankMusacchia

Hi, how are you installing gemini? What is the path of the code you modified?

You could try to install an older version of pyyaml.
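As an alternative to downgrading, here is a minimal sketch of the change the PyYAML notice asks for: pass an explicit safe loader instead of calling yaml.load() bare. It assumes the config file is plain YAML with no custom tags. `load_config` is a hypothetical stand-in for the `config = yaml.load(in_handle)` call in the warning, and the keys in the example are illustrative only. Note that YAMLLoadWarning is only a warning, so this change alone would not be expected to fix the failed chunk load.

```python
import io

import yaml  # PyYAML (third-party)


def load_config(in_handle):
    """Parse a YAML config with the safe loader.

    yaml.safe_load only builds plain Python objects (dicts, lists,
    strings, numbers), so it does not trigger YAMLLoadWarning.
    """
    return yaml.safe_load(in_handle)


if __name__ == "__main__":
    # Illustrative config content; these keys are assumptions, not gemini's real file.
    example = io.StringIO("annotation_dir: /path/to/gemini_data\nversions:\n  example: 1\n")
    config = load_config(example)
    print(config["annotation_dir"])
```

The equivalent explicit form is yaml.load(in_handle, Loader=yaml.SafeLoader).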

brentp avatar Jan 21 '22 08:01 brentp

Hi, I followed the instructions at: https://gemini.readthedocs.io/en/latest/

and installed the required tools, for example grabix, vt and snpeff.

frankMusacchia avatar Jan 21 '22 08:01 frankMusacchia

Well, the most critical error is that it's not finding gemini when you run the load. What does: env | grep -i python show?
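A rough sketch of that diagnosis in Python, for anyone hitting the same message. It assumes (based only on the "/bin/sh: 1: gemini: not found" lines) that each chunk is launched as a shell command that looks up gemini by name on PATH, so a gemini started by absolute path can still fail in its child processes. All helper names here are hypothetical, and the fake tool is a stand-in script, not the real gemini.

```python
import os
import shutil
import stat
import subprocess
import tempfile


def python_related_env(environ):
    """Return variables whose name or value mentions 'python' (like `env | grep -i python`)."""
    return {k: v for k, v in environ.items()
            if "python" in k.lower() or "python" in v.lower()}


def resolves_on_path(tool, path):
    """Return the full path found for `tool` on the given PATH string, or None."""
    return shutil.which(tool, path=path)


def run_in_shell(command, path):
    """Run `command` through the shell with PATH set to `path`; return its exit code."""
    env = dict(os.environ, PATH=path)
    result = subprocess.run(command, shell=True, env=env,
                            stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode


def make_fake_tool(directory, name):
    """Create a tiny executable stand-in script (not the real gemini)."""
    tool = os.path.join(directory, name)
    with open(tool, "w") as handle:
        handle.write("#!/bin/sh\nexit 0\n")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)
    return tool


if __name__ == "__main__":
    # Check the current environment.
    print(python_related_env(os.environ))
    print(resolves_on_path("gemini", os.environ.get("PATH", "")))

    # Reproduce the failure and the fix with a stand-in tool.
    with tempfile.TemporaryDirectory() as bin_dir:
        make_fake_tool(bin_dir, "fake_gemini_tool")
        base_path = "/usr/bin:/bin"
        print(run_in_shell("fake_gemini_tool", base_path))  # non-zero: not on PATH
        print(run_in_shell("fake_gemini_tool", bin_dir + os.pathsep + base_path))
```

If the lookup returns None for gemini, adding the directory that contains the gemini executable to PATH before running the load is the likely fix under this assumption.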

Also, let's get your gemini install working, but you might consider using slivar (https://github.com/brentp/slivar), especially for a cohort this size.

brentp avatar Jan 21 '22 08:01 brentp

Yes, the cohort is large and I was wondering whether gemini would work on it. I will try slivar then. Thank you.

frankMusacchia avatar Jan 21 '22 08:01 frankMusacchia