epa-ng
Error running this command
Hi there, I am trying to follow this PICRUSt2 tutorial.
After running this command: place_seqs.py -s ../seqs.fna -o out.tre -p 1 --intermediate intermediate/place_seqs
it shows "Error running this command": epa-ng --tree /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.tre --ref-msa intermediate/place_seqs/ref_seqs_hmmalign.fasta --query intermediate/place_seqs/study_seqs_hmmalign.fasta --chunk-size 5000 -T 1 -m /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.model -w intermediate/place_seqs/epa_out --filter-acc-lwr 0.99 --filter-max 100
Any idea what might be the issue or how to resolve this?
Thanks in advance
Hi @edmundtayzy,
What error does it show? Or does it simply abort?
If it just aborts, it's likely that you ran out of memory. Try running the script with a lower chunk size by adding something like
--chunk_size 500
Hi Pierre,
Thanks for your prompt reply!
I get the following messages:
epa-ng --tree /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.tre --ref-msa intermediate/place_seqs/ref_seqs_hmmalign.fasta --query intermediate/place_seqs/study_seqs_hmmalign.fasta --chunk-size 5000 -T 1 -m /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.model -w intermediate/place_seqs/epa_out --filter-acc-lwr 0.99 --filter-max 100
Standard output of failed command: ""
Standard error of failed command: "terminate called after throwing an instance of 'std::runtime_error' what(): intermediate/place_seqs/epa_out/epa_info.log already exists! To overwrite existing output files, rerun with --redo "
(Apologies in advance!) I'm still relatively new to bioinformatics. You mentioned lowering the chunk size: am I supposed to enter that option as a separate command beforehand? (I tried that, but I get "--chunk_size: command not found".) Or am I supposed to add it somewhere in the original command (place_seqs.py -s ../seqs.fna -o out.tre -p 1 --intermediate intermediate/place_seqs)?
Thank you so much for your help!
what(): intermediate/place_seqs/epa_out/epa_info.log already exists! To overwrite existing output files, rerun with --redo
There's the problem: your "intermediate" folder already contains results, and EPA-ng is protecting you from overwriting them. I'm not that familiar with how PICRUSt2 handles things, but I think the problem should resolve if you delete the intermediate folder (of course, check first whether there's any data in there you might need).
Let me know if it works!
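If you would rather not delete anything outright, the same effect can be had by moving the folder aside before rerunning. The snippet below is only a sketch: `backup_intermediate` is a hypothetical helper name (not part of PICRUSt2 or EPA-ng), and it assumes the folder to move is the top-level `intermediate` directory used in the command above.

```shell
# Hypothetical helper (not part of PICRUSt2 or EPA-ng): rename an existing
# intermediate directory instead of deleting it, so old results stay available.
backup_intermediate() {
    dir="${1%/}"                                   # drop a trailing slash, if any
    if [ -d "$dir" ]; then
        backup="${dir}.bak.$(date +%Y%m%d%H%M%S)"  # timestamped backup name
        mv "$dir" "$backup"
        echo "$backup"                             # print where the old data went
    fi
}
```

Calling `backup_intermediate intermediate` from the directory where place_seqs.py was run prints the new folder name; after that, the original place_seqs.py command can be rerun with a fresh intermediate folder.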
I deleted the intermediate folder and got the following error message instead:
Error running this command: epa-ng --tree /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.tre --ref-msa intermediate/place_seqs/ref_seqs_hmmalign.fasta --query intermediate/place_seqs/study_seqs_hmmalign.fasta --chunk-size 5000 -T 1 -m /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.model -w intermediate/place_seqs/epa_out --filter-acc-lwr 0.99 --filter-max 100
Standard output of failed command:
"INFO Selected: Output dir: intermediate/place_seqs/epa_out/
INFO Selected: Query file: intermediate/place_seqs/study_seqs_hmmalign.fasta
INFO Selected: Tree file: /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.tre
INFO Selected: Reference MSA: intermediate/place_seqs/ref_seqs_hmmalign.fasta
INFO Selected: Filtering by accumulated threshold: 0.99
INFO Selected: Maximum number of placements per query: 100
INFO Selected: Automatic switching of use of per rate scalers
INFO Selected: Preserving the root of the input tree
INFO Selected: Specified model file: /home/tayezy/miniconda3/envs/picrust2/lib/python3.6/site-packages/picrust2/default_files/prokaryotic/pro_ref/pro_ref.model
INFO Rate heterogeneity: GAMMA (4 cats, mean), alpha: 0.453141 (user), weights&rates: (0.25,0.0250674) (0.25,0.220229) (0.25,0.782933) (0.25,2.97177)
Base frequencies (user): 0.229585 0.22008 0.298596 0.251739
Substitution rates (user): 1.00319 2.79077 1.5301 0.87441 3.83966 1
INFO Selected: Reading queries in chunks of: 5000
INFO Selected: Using threads: 1
INFO [EPA-ng ASCII-art banner] (v0.3.5)
INFO Output file: intermediate/place_seqs/epa_out/epa_result.jplace
"
Standard error of failed command: ""
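Hmm, there doesn't seem to be any error in that output. Maybe now you can try adjusting the chunk size. From what I can tell, you can do that by passing the option I mentioned (--chunk_size 500) to place_seqs.py, i.e. appending it to your original place_seqs.py command rather than running it on its own.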
Maybe @gavindouglas knows more?
EDIT: you probably also want to post this on the PICRUSt2 GitHub (and reference this issue).
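For anyone landing here with the same question about where the chunk-size option goes: typed on its own, `--chunk_size 500` is treated by the shell as a command name, hence "command not found". It belongs on the end of the place_seqs.py call. The sketch below just assembles and prints that full command; `build_place_seqs_cmd` is a hypothetical helper, the paths are the ones from this thread, and it assumes place_seqs.py accepts `--chunk_size` as suggested above.

```shell
# Hypothetical helper: print the place_seqs.py command from this thread
# with a chosen chunk size appended as an option.
# Assumption: place_seqs.py accepts --chunk_size, as suggested in this thread.
build_place_seqs_cmd() {
    chunk_size="${1:-500}"   # default to the value suggested above
    echo "place_seqs.py -s ../seqs.fna -o out.tre -p 1 --intermediate intermediate/place_seqs --chunk_size ${chunk_size}"
}

build_place_seqs_cmd 500
```

The printed line can then be pasted into the terminal inside the activated picrust2 conda environment. A smaller value should lower peak memory use if memory really is the limiting factor.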