program stopped abnormally and it does not show the error

Open Jokendo-collab opened this issue 2 years ago • 15 comments

The analysis stopped without showing what the problem could be. Could you have a look at the below and advise:

SqueezeMeta v1.5.1, Jan 2022 - (c) J. Tamames, F. Puente-Sánchez CNB-CSIC, Madrid, SPAIN

Please cite: Tamames & Puente-Sanchez, Frontiers in Microbiology 9, 3349 (2019). doi: https://doi.org/10.3389/fmicb.2018.03349

Run started Tue Aug 16 23:46:14 2022 in coassembly mode
Now creating directories
Reading configuration from /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/SqueezeMeta_conf.pl
Reading samples from /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/data/00.metaAssembly.samples
12 samples found: SRR12670006 SRR12670014 SRR12670011 SRR12670017 SRR12670012 SRR12670010 SRR12670009 SRR12670015 SRR12670016 SRR12670008 SRR12670007 SRR12670013

Now merging files
[10 minutes, 36 seconds]: STEP1 -> RUNNING CO-ASSEMBLY: 01.run_assembly.pl (megahit)
  Running assembly with megahit
  Running prinseq (Schmieder et al 2011, Bioinformatics 27(6):863-4) for selecting contigs longer than 200
  Renaming contigs
  Counting length of contigs
  Contigs stored in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/01.metaAssembly.fasta
  Number of contigs: 437601
[5 hours, 40 minutes, 58 seconds]: STEP2 -> RNA PREDICTION: 02.rnas.pl
  Running barrnap (Seeman 2014, Bioinformatics 30, 2068-9) for predicting RNAs: Bacteria Archaea Eukaryote Mitochondrial
  Running RDP classifier (Wang et al 2007, Appl Environ Microbiol 73, 5261-7)
  Running Aragorn (Laslett & Canback 2004, Nucleic Acids Res 31, 11-16) for tRNA/tmRNA prediction
[5 hours, 45 minutes, 9 seconds]: STEP3 -> ORF PREDICTION: 03.run_prodigal.pl
  Running prodigal (Hyatt et al 2010, BMC Bioinformatics 11: 119) for predicting ORFs
  ORFs predicted: 563089
[6 hours, 11 minutes, 2 seconds]: STEP4 -> HOMOLOGY SEARCHES: 04.rundiamond.pl
  taxa
Stopping in STEP4 -> 04.rundiamond.pl. Program finished abnormally

If you don't know what went wrong or want further advice, please look for similar issues in https://github.com/jtamames/SqueezeMeta/issues
Feel free to open a new issue if you don't find the answer there.
Please add a brief description of the problem and upload the /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/syslog file (zip it first)

Jokendo-collab avatar Aug 17 '22 08:08 Jokendo-collab

syslog.zip

What could have happened here?

Input and filter stats:
        Input sequences: 437,601
        Input bases: 587,772,414
        Input mean length: 1343.17
        Good sequences: 437,601 (100.00%)
        Good bases: 587,772,414
        Good mean length: 1343.17
        Bad sequences: 0 (0.00%)
        Sequences filtered by specified parameters:
        none
No such file or directory
Error: Error calling stat on file /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db/nr.dmnd
Error running command: /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/bin/diamond blastp -q /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/03.metaAssembly.faa -p 40 -d /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db/nr.dmnd -e 0.001 --id 50 -f tab -b 8 --quiet -o /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/04.metaAssembly.nr.diamond at /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/scripts/04.rundiamond.pl line 72.
Died at /users/javanokendo/.conda/envs/SqueezeMeta/bin/SqueezeMeta.pl line 1381.

Jokendo-collab avatar Aug 17 '22 08:08 Jokendo-collab

Hello, it looks like your database is not properly installed. Did you run test_install.pl to check the installation? Best, J
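
The missing-file error in the log can also be caught before launching a long run. The following is a minimal sketch, not part of SqueezeMeta: `check_db_file` is a hypothetical helper and `DB_DIR` is a placeholder that should point at the database directory named in SqueezeMeta_conf.pl. It only checks that the file exists and is non-empty, so it complements test_install.pl rather than replacing it.

```shell
# Hypothetical helper (not part of SqueezeMeta): succeed only if the
# given file exists and is non-empty.
check_db_file() {
    if [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING OR EMPTY: $1"
        return 1
    fi
}

# DB_DIR is an assumption: set it to the database directory that
# SqueezeMeta_conf.pl points to.
DB_DIR="${DB_DIR:-/path/to/databases/db}"

# nr.dmnd is the file DIAMOND failed to stat in the log above.
check_db_file "$DB_DIR/nr.dmnd" || echo "Re-download the databases before restarting the run."
```

Running it with `DB_DIR` set to the real database directory should print either an `OK:` line or the reminder to download again.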

jtamames avatar Aug 22 '22 08:08 jtamames

When I run test_install.pl I get the following:

module add anaconda3
conda activate SqueezeMeta

Checking the OS
        linux OK

Checking that tree is installed
        tree --help OK

Checking that ruby is installed
        ruby -h OK

Checking that java is installed
        java -h OK

Checking that all the required perl libraries are available in this environment
        perl -e 'use Term::ANSIColor' OK
        perl -e 'use DBI' OK
        perl -e 'use DBD::SQLite::Constants' OK
        perl -e 'use Time::Seconds' OK
        perl -e 'use Tie::IxHash' OK
        perl -e 'use Linux::MemInfo' OK
        perl -e 'use Getopt::Long' OK
        perl -e 'use File::Basename' OK
        perl -e 'use DBD::SQLite' OK
        perl -e 'use Data::Dumper' OK
        perl -e 'use Cwd' OK
        perl -e 'use XML::LibXML' OK
        perl -e 'use XML::Parser' OK
        perl -e 'use Term::ANSIColor' OK
Checking that all the required python libraries are available in this environment
        python3 -h OK
        python3 -c 'import numpy' OK
        python3 -c 'import scipy' OK
        python3 -c 'import matplotlib' OK
        python3 -c 'import dendropy' OK
        python3 -c 'import pysam' OK
        python3 -c 'import Bio.Seq' OK
        python3 -c 'import pandas' OK
        python3 -c 'import sklearn' OK
        python3 -c 'import nose' OK
        python3 -c 'import cython' OK
        python3 -c 'import future' OK
Checking that all the required R libraries are available in this environment
        R -h OK
        R -e 'library(doMC)' OK
        R -e 'library(ggplot2)' OK
        R -e 'library(data.table)' OK
        R -e 'library(reshape2)' OK
        R -e 'library(pathview)' OK
        R -e 'library(DASTool)' OK
        R -e 'library(SQMtools)' OK

Checking that SqueezeMeta is properly configured... checking database in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db
        SqueezeMeta_conf.pl says that databases are located in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db but we can't find nr.db there, or it is corrupted

------------------------------------------------------------------------------

WARNING: Some SqueezeMeta dependencies could not be found in your environment!
        SqueezeMeta_conf.pl says that databases are located in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/databases/db but we can't find nr.db there, or it is corrupted

Jokendo-collab avatar Aug 22 '22 08:08 Jokendo-collab

That's it. So please download the databases again and run configure_nodb.pl. Best, J

jtamames avatar Aug 22 '22 08:08 jtamames

@jtamames thanks for pointing this out. I also tried to unzip nr.gz, but it appears to be corrupted.

Jokendo-collab avatar Aug 22 '22 08:08 Jokendo-collab

Yeah, probably your download failed. Please try again
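
A failed or truncated download can usually be confirmed without unpacking the whole archive. This is a minimal sketch using `gzip -t`, which tests the integrity of a compressed file; `verify_gzip` is a hypothetical helper and `ARCHIVE` is a placeholder for wherever nr.gz was saved.

```shell
# Hypothetical helper: report whether a gzip archive passes gzip's
# built-in integrity test (-t). A missing file also counts as a failure.
verify_gzip() {
    if gzip -t "$1" 2>/dev/null; then
        echo "intact: $1"
    else
        echo "missing, corrupted or truncated: $1"
        return 1
    fi
}

# ARCHIVE is an assumption: point it at the downloaded file.
ARCHIVE="${ARCHIVE:-nr.gz}"
verify_gzip "$ARCHIVE" || echo "Download $ARCHIVE again before building the database."
```

If the check fails, the download should be repeated, as suggested above.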

jtamames avatar Aug 22 '22 08:08 jtamames

This seems to be the same problem as in #500 and #523, in which make_database.pl fails due to NCBI servers rejecting/stalling the connection. You can try to download a pre-compiled database with download_databases.pl instead.

fpusan avatar Aug 22 '22 08:08 fpusan

Creating the SQLite databases takes over 100 hours. Is that normal, or am I doing something wrong?

Jokendo-collab avatar Aug 27 '22 13:08 Jokendo-collab

No, it should be faster. Are you writing the databases into a network-mounted drive?

fpusan avatar Aug 27 '22 13:08 fpusan

Yes, I work in an HPC environment.

Jokendo-collab avatar Aug 27 '22 15:08 Jokendo-collab

This issue occurs in some (not all) HPC environments when the databases are stored on a remote, network-mounted drive: each SQLite query takes a long time to process. You can use download_databases.pl to avoid having to build your own database, but I think the issue might still appear when running the pipeline (namely, step 06 might be quite slow). If this is the case, consider contacting your system administrators; they may be able to recommend an alternative location for the databases.
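
To see whether the database directory sits on a network filesystem, the filesystem type can be inspected. This is a rough sketch, assuming GNU coreutils `stat` (typical on Linux HPC nodes); `is_network_fs` is a hypothetical helper, its list of type names is illustrative rather than exhaustive, and `DB_DIR` is a placeholder for the database directory.

```shell
# Hypothetical helper: succeed if a filesystem type name looks like a
# network or cluster filesystem. The list is illustrative, not exhaustive.
is_network_fs() {
    case "$1" in
        nfs*|lustre|gpfs|cifs|smb*|ceph|panfs|afs) return 0 ;;
        *) return 1 ;;
    esac
}

# DB_DIR is an assumption: set it to the database directory that
# SqueezeMeta_conf.pl points to.
DB_DIR="${DB_DIR:-.}"

# GNU stat: -f reports on the filesystem, %T prints its type by name.
fstype=$(stat -f -c %T "$DB_DIR" 2>/dev/null || echo unknown)

if is_network_fs "$fstype"; then
    echo "$DB_DIR is on a network filesystem ($fstype); SQLite lookups may be slow."
else
    echo "$DB_DIR filesystem type: $fstype"
fi
```

A network type here does not prove that SQLite will be slow, since this only affects some clusters, but it is a useful detail to pass on to the system administrators.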

fpusan avatar Aug 28 '22 07:08 fpusan

What could be causing the following error:

[0 seconds]: STEP17 -> CHECKING BINS: 17.checkM_batch.pl
  Evaluating bins with CheckM (Parks et al 2015, Genome Res 25, 1043-55)

  Reading /users/javanokendo/.conda/envs/SqueezeMeta/SqueezeMeta/data/alltaxlist.txt
  Looking for DAS bins in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/results/bins
  9 bins found

  Bin 1/9: metabat2.59.fa.contigs.fa.tax
  Bin 2/9: maxbin.010.fasta.contigs.fa.tax
  Using profile for phylum rank : Chordata
  Using profile for domain rank : Eukaryota
  Bin 3/9: metabat2.5.fa.contigs.fa.tax
  Bin 4/9: metabat2.47.fa.contigs.fa.tax
  Bin 5/9: maxbin.004.fasta.contigs.fa.tax
  Using profile for phylum rank : Chordata
  Using profile for domain rank : Eukaryota
  Bin 6/9: maxbin.009.fasta.contigs.fa.tax
  Using profile for phylum rank : Chordata
  Using profile for domain rank : Eukaryota
  Bin 7/9: metabat2.91.fa.contigs.fa.tax
  Using profile for phylum rank : Chordata
  Using profile for domain rank : Eukaryota
  Bin 8/9: metabat2.40.fa.contigs.fa.tax
  Bin 9/9: metabat2.33.fa.contigs.fa.tax
  Using profile for phylum rank : Chordata
  Using profile for domain rank : Eukaryota

  Storing results for DAS in /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/17.metaAssembly.checkM
Can't find /cbio/users/javanokendo/sarsCovProject/art_variantAnalysis/metagenomics/metaAssembly/intermediate/17.metaAssembly.checkM
Stopping in STEP18 -> 17.checkM_batch.pl

Jokendo-collab avatar Sep 03 '22 20:09 Jokendo-collab

All your bins are eukaryotic, and therefore CheckM cannot evaluate them. No big deal, restart at step 19 to finish the run.

jtamames avatar Sep 04 '22 11:09 jtamames

How do I restart it at step 19? If I use the restart script it fails. Could you provide the command to start at a specific step?

Jokendo-collab avatar Sep 04 '22 12:09 Jokendo-collab

As stated in the manual, the -step argument lets you restart at any point.
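
For this run, the command could look roughly like the sketch below. It assumes that `-step` is combined with the `--restart` and `-p` options as described in the SqueezeMeta manual, and that the project is named metaAssembly (inferred from the output paths in the logs above); please check both against the manual for your installed version. `restart_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running.

```shell
# Hypothetical helper: print (do not run) a restart command for a
# given project and step number.
# Assumption: --restart, -p and -step behave as described in the
# SqueezeMeta manual for the installed version.
restart_cmd() {
    project="$1"
    step="$2"
    echo "SqueezeMeta.pl -p $project --restart -step $step"
}

# Project name inferred from the paths in the logs (assumption).
restart_cmd metaAssembly 19
```

Once the printed line looks right, it can be run as-is from the directory that contains the project folder.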

jtamames avatar Sep 04 '22 12:09 jtamames

Closing due to lack of activity, feel free to reopen

fpusan avatar Oct 26 '22 11:10 fpusan