SqueezeMeta
program stopped abnormally and it does not show the error
The analysis stopped without showing what the problem could be. Could you have a look at the output below and advise?
SqueezeMeta v1.5.1, Jan 2022 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN
Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349
Run started Tue Aug 16 23:46:14 2022 in coassembly mode
Now creating directories
Reading configuration from /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/SqueezeMeta_conf.pl
Reading samples from /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/data/00.metaAssembly.samples
12 samples found: SRR12670006 SRR12670014 SRR12670011 SRR12670017 SRR12670012 SRR12670010 SRR12670009 SRR12670015 SRR12670016 SRR12670008 SRR12670007 SRR12670013
Now merging files
[10 minutes, 36 seconds]: STEP1 -> RUNNING CO-ASSEMBLY: 01.run_assembly.pl (megahit)
Running assembly with megahit
Running prinseq (Schmieder et al 2011, Bioinformatics 27(6):863-4) for selecting contigs longer than 200
Renaming contigs
Counting length of contigs
Contigs stored in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/01.metaAssembly.fasta
Number of contigs: 437601
[5 hours, 40 minutes, 58 seconds]: STEP2 -> RNA PREDICTION: 02.rnas.pl
Running barrnap (Seeman 2014, Bioinformatics 30, 2068-9) for predicting RNAs: Bacteria Archaea Eukaryote Mitochondrial
Running RDP classifier (Wang et al 2007, Appl Environ Microbiol 73, 5261-7)
Running Aragorn (Laslett & Canback 2004, Nucleic Acids Res 31, 11-16) for tRNA/tmRNA prediction
[5 hours, 45 minutes, 9 seconds]: STEP3 -> ORF PREDICTION: 03.run_prodigal.pl
Running prodigal (Hyatt et al 2010, BMC Bioinformatics 11: 119) for predicting ORFs
ORFs predicted: 563089
[6 hours, 11 minutes, 2 seconds]: STEP4 -> HOMOLOGY SEARCHES: 04.rundiamond.pl
taxa
Stopping in STEP4 -> 04.rundiamond.pl. Program finished abnormally
If you don't know what went wrong or want further advice, please look for similar issues in https://github.com/jtamames/SqueezeMeta/issues
Feel free to open a new issue if you don't find the answer there. Please add a brief description of the problem and upload the /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/syslog file (zip it first)
syslog.zip
What could have happened here:
Input and filter stats:
Input sequences: 437,601
Input bases: 587,772,414
Input mean length: 1343.17
Good sequences: 437,601 (100.00%)
Good bases: 587,772,414
Good mean length: 1343.17
Bad sequences: 0 (0.00%)
Sequences filtered by specified parameters: none
No such file or directory
Error: Error calling stat on file /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db/nr.dmnd
Error running command: /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/03.metaAssembly.faa -p 40 -d /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db/nr.dmnd -e 0.001 --id 50 -f tab -b 8 --quiet -o /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/04.metaAssembly.nr.diamond at /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/scripts/04.rundiamond.pl line 72.
Died at /users/javanokendo/.conda/envs/SqueezeMeta/bin/SqueezeMeta.pl line 1381.
Hello,
It looks like your database is not properly installed. Did you run test_install.pl to check the installation?
Best, J
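For anyone hitting the same "Error calling stat on file" message, a quick pre-flight check of the database directory can save a six-hour run. The following is only a minimal sketch, not part of SqueezeMeta: the function name `missing_database_files` is made up for illustration, and the file list contains only the two names that appear in the messages in this thread (nr.dmnd and nr.db), so a real installation may well need more files than these.

```python
from pathlib import Path

# Assumption: only the two file names mentioned in this thread are checked.
# A complete SqueezeMeta database directory may contain other required files.
EXPECTED_FILES = ("nr.dmnd", "nr.db")


def missing_database_files(db_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent or empty in db_dir."""
    db_path = Path(db_dir)
    missing = []
    for name in expected:
        candidate = db_path / name
        # A zero-byte file is treated as missing: it usually means a failed download.
        if not candidate.is_file() or candidate.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_database_files("/path/to/databases/db")` with the directory named in SqueezeMeta_conf.pl returns an empty list when both files are present and non-empty. This only checks presence and size; test_install.pl remains the proper way to validate the installation.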
When I run test_install.pl
I get the following:
module add anaconda3
conda activate SqueezeMeta
Checking the OS
linux OK
Checking that tree is installed
tree --help OK
Checking that ruby is installed
ruby -h OK
Checking that java is installed
java -h OK
Checking that all the required perl libraries are available in this environment
perl -e 'use Term::ANSIColor' OK
perl -e 'use DBI' OK
perl -e 'use DBD::SQLite::Constants' OK
perl -e 'use Time::Seconds' OK
perl -e 'use Tie::IxHash' OK
perl -e 'use Linux::MemInfo' OK
perl -e 'use Getopt::Long' OK
perl -e 'use File::Basename' OK
perl -e 'use DBD::SQLite' OK
perl -e 'use Data::Dumper' OK
perl -e 'use Cwd' OK
perl -e 'use XML::LibXML' OK
perl -e 'use XML::Parser' OK
perl -e 'use Term::ANSIColor' OK
Checking that all the required python libraries are available in this environment
python3 -h OK
python3 -c 'import numpy' OK
python3 -c 'import scipy' OK
python3 -c 'import matplotlib' OK
python3 -c 'import dendropy' OK
python3 -c 'import pysam' OK
python3 -c 'import Bio.Seq' OK
python3 -c 'import pandas' OK
python3 -c 'import sklearn' OK
python3 -c 'import nose' OK
python3 -c 'import cython' OK
python3 -c 'import future' OK
Checking that all the required R libraries are available in this environment
R -h OK
R -e 'library(doMC)' OK
R -e 'library(ggplot2)' OK
R -e 'library(data.table)' OK
R -e 'library(reshape2)' OK
R -e 'library(pathview)' OK
R -e 'library(DASTool)' OK
R -e 'library(SQMtools)' OK
Checking that SqueezeMeta is properly configured... checking database in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db
SqueezeMeta_conf.pl says that databases are located in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db but we can't find nr.db there, or it is corrupted
------------------------------------------------------------------------------
WARNING: Some SqueezeMeta dependencies could not be found in your environment!
SqueezeMeta_conf.pl says that databases are located in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db but we can't find nr.db there, or it is corrupted
That's it. So please download the databases again and run configure_nodb.pl. Best, J
@jtamames thanks for pointing this. I also tried to unzip nr.gz
but it appears to be corrupted
Yeah, probably your download failed. Please try again
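To confirm whether a downloaded archive such as nr.gz is truncated or corrupted before retrying, the whole stream can be decompressed once and any error caught. This is a minimal sketch using only the Python standard library; the function name `gzip_is_intact` is hypothetical. On the command line, `gzip -t nr.gz` performs the same kind of integrity test.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` decompresses completely without error."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read and discard the data in chunks so large files do not fill memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers missing files and bad gzip headers,
        # EOFError covers truncated downloads, zlib.error covers corrupted data.
        return False
    return True
```

A result of False means the file should be downloaded again. A result of True only shows that the compression layer is intact, not that the contents are what SqueezeMeta expects.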
This seems to be the same problem as in #500 and #523, in which make_database.pl fails due to NCBI servers rejecting/stalling the connection. You can try to download a pre-compiled database with download_databases.pl instead.
The "Creating sqlite databases" step takes over 100 hrs. Is this normal, or am I doing something wrong?
No, it should be faster. Are you writing the databases into a network-mounted drive?
Yes, I work in an HPC environment.
This issue happens in some (not all) HPC environments when the databases are stored on a remotely mounted drive: each SQLite query takes a long time to be processed. You can use download_databases.pl to avoid having to build your own database, but I think the issue might still appear when running the pipeline (namely, step 06 might be quite slow). If this is the case, maybe you can contact your system administrators and ask them to recommend an alternative location for the databases.
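One way to check whether a database directory sits on a remotely mounted drive is to look up its mount point in /proc/mounts on Linux. The sketch below is only illustrative: the function names are hypothetical, the set of filesystem types treated as "network" is an assumption that should be adjusted for your cluster, and mount points containing spaces (which /proc/mounts escapes) are not handled.

```python
import os

# Assumption: an illustrative, incomplete list of network/parallel filesystem types.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3", "lustre", "gpfs", "beegfs", "ceph"}


def filesystem_type(path, mounts_text):
    """Return (mount_point, fs_type) for the longest mount point that contains `path`."""
    target = os.path.realpath(path) + "/"
    best = ("", "")
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the most specific (longest) mount point that is a prefix of the path.
        if target.startswith(prefix) and len(mount_point) > len(best[0]):
            best = (mount_point, fs_type)
    return best


def is_network_mounted(path, mounts_text):
    """Return True if `path` lives on one of the assumed network filesystem types."""
    return filesystem_type(path, mounts_text)[1] in NETWORK_FS_TYPES
```

On a Linux node this would be called as `is_network_mounted(db_dir, open("/proc/mounts").read())`. A True result does not prove that SQLite will be slow, it only indicates that the situation described above may apply.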
What could be causing the following error:
[0 seconds]: STEP17 -> CHECKING BINS: 17.checkM_batch.pl
Evaluating bins with CheckM (Parks et al 2015, Genome Res 25, 1043-55)
Reading /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/data/alltaxlist.txt
Looking for DAS bins in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/bins
9 bins found
Bin 1/9: metabat2.59.fa.contigs.fa.tax
Bin 2/9: maxbin.010.fasta.contigs.fa.tax
Using profile for phylum rank : Chordata
Using profile for domain rank : Eukaryota
Bin 3/9: metabat2.5.fa.contigs.fa.tax
Bin 4/9: metabat2.47.fa.contigs.fa.tax
Bin 5/9: maxbin.004.fasta.contigs.fa.tax
Using profile for phylum rank : Chordata
Using profile for domain rank : Eukaryota
Bin 6/9: maxbin.009.fasta.contigs.fa.tax
Using profile for phylum rank : Chordata
Using profile for domain rank : Eukaryota
Bin 7/9: metabat2.91.fa.contigs.fa.tax
Using profile for phylum rank : Chordata
Using profile for domain rank : Eukaryota
Bin 8/9: metabat2.40.fa.contigs.fa.tax
Bin 9/9: metabat2.33.fa.contigs.fa.tax
Using profile for phylum rank : Chordata
Using profile for domain rank : Eukaryota
Storing results for DAS in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/17.metaAssembly.checkM
Can't find /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/17.metaAssembly.checkM
Stopping in STEP18 -> 17.checkM_batch.pl
All your bins are eukaryotic, and therefore CheckM cannot evaluate them. No big deal, restart in step 19 to finish the run.
How do I restart it at step 19? If I use the restart script it fails. Could you provide the command to start at a specific step?
As stated in the manual, the -step argument lets you restart at any point.
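As a rough illustration of what such a restart could look like, the sketch below only assembles the command line and prints it. The -step argument is the one named above; the --restart and -p flags and the project name metaAssembly (guessed from the paths in the logs) are assumptions that should be checked against the manual for your SqueezeMeta version before running anything. The function name `restart_command` is hypothetical.

```python
import shlex


def restart_command(project, step):
    """Build the argument list for restarting a SqueezeMeta project at `step`."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # Assumption: --restart and -p are taken from the SqueezeMeta manual as recalled;
    # verify them for your installed version.
    return ["SqueezeMeta.pl", "-p", str(project), "--restart", "-step", str(step)]


# Assumption: the project is named metaAssembly, as the paths in the logs suggest.
command = restart_command("metaAssembly", 19)
print(shlex.join(command))
```

The printed line can be pasted into a shell inside the SqueezeMeta conda environment, or the list can be passed to `subprocess.run(command, check=True)` from a job script.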
Closing due to lack of activity, feel free to reopen