
Cannot reach https://busco-data.ezlab.org/v5/data/file_versions.tsv

Open ChristophKnapp opened this issue 3 years ago • 15 comments

Description of the bug

Hello, when I start nf-core/mag, it runs for some time and then stops with:

ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.

See the attached error log. BUSCO already has a fixed issue for this problem (https://gitlab.com/ezlab/busco/-/issues/567), which is why I am posting here first. Tell me to go away if you think they should reopen that issue.
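To separate a download failure from the other failure modes, the `.err` file named in the error message can be searched for the text from the title of this issue. The following is only a minimal sketch: it assumes BUSCO's stderr log contains a line with `Cannot reach` followed by the URL, and the function name `classify_busco_err` is made up for illustration (it is not part of nf-core/mag or BUSCO).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of nf-core/mag or BUSCO.
# Assumption: a failed download appears in BUSCO's stderr as "Cannot reach <URL>".
classify_busco_err() {
    local err_file="$1"
    if [ ! -r "$err_file" ]; then
        # The log does not exist or cannot be read
        echo "missing"
    elif grep -q 'Cannot reach ' "$err_file"; then
        # BUSCO could not download a file it needs
        echo "download"
    elif grep -q 'No genes were recognized by BUSCO' "$err_file"; then
        # The case the pipeline script already handles as a failed bin
        echo "no_genes"
    else
        echo "unknown"
    fi
}
```

Run from the failed task's work directory, it would be called as `classify_busco_err MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err`.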

I tried to access https://busco-data.ezlab.org/v5/data/file_versions.tsv with wget and curl and had no problem downloading it from the machine this runs on. Therefore I don't think this is a firewall issue of some sort, but I could be wrong: I don't know exactly how BUSCO tries to download the file.
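Since wget and curl may not take the same route as the failing tool, one further check is to repeat the download from Python inside the same conda environment. This is a sketch under an assumption that has not been verified here: that BUSCO fetches files through Python's standard `urllib` rather than through an external client. The function name `python_can_fetch` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical check, not taken from BUSCO.
# Assumption: the failing download goes through Python's standard library,
# so proxy or certificate settings may differ from those used by curl/wget.
python_can_fetch() {
    local url="$1"
    python3 - "$url" <<'PY'
import sys
import urllib.request

try:
    # Read a single byte: enough to prove the resource can be opened
    with urllib.request.urlopen(sys.argv[1], timeout=30) as response:
        response.read(1)
except Exception as exc:
    # Print the exception type so proxy, DNS and TLS problems can be told apart
    print(f"failed: {type(exc).__name__}: {exc}")
    sys.exit(1)
print("ok")
PY
}
```

Calling `python_can_fetch https://busco-data.ezlab.org/v5/data/file_versions.tsv` with the pipeline's conda environment activated prints `ok` on success, or the exception type and message on failure.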

At first I also thought this might be just an internet hiccup, so after confirming that I could download the file I resumed the analysis. That did not help: the error occurs every time I resume.

Thanks for your help

Christoph

Command used and terminal output

nextflow run nf-core/mag -profile conda --input '../data/*_R{1,2}.fastq.gz' --outdir results -r fix-convert-depths-gzip -resume
N E X T F L O W  ~  version 22.04.5
Launching `https://github.com/nf-core/mag` [elated_stonebraker] DSL2 - revision: 1b4456d542 [fix-convert-depths-gzip]


------------------------------------------------------
                                        ,--./,-.
        ___     __   __   __   ___     /,-._.--~'
  |\ | |__  __ /  ` /  \ |__) |__         }  {
  | \| |       \__, \__/ |  \ |___     \`-._,-`-,
                                        `._,._,'
  nf-core/mag v2.3.0dev
------------------------------------------------------
Core Nextflow options
  revision        : fix-convert-depths-gzip
  runName         : elated_stonebraker
  launchDir       : /media/NGS/nf-core-workflow
  workDir         : /media/NGS/nf-core-workflow/work
  projectDir      : /home/hummelchen/.nextflow/assets/nf-core/mag
  userName        : hummelchen
  profile         : conda
  configFiles     : /home/hummelchen/.nextflow/assets/nf-core/mag/nextflow.config

Input/output options
  input           : ../data/*_R{1,2}.fastq.gz
  outdir          : results

Generic options
  enable_conda    : true

Quality control for short reads options
  phix_reference  : /home/hummelchen/.nextflow/assets/nf-core/mag/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz

Quality control for long reads options
  lambda_reference: /home/hummelchen/.nextflow/assets/nf-core/mag/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz

Taxonomic profiling options
  gtdb            : https://data.ace.uq.edu.au/public/gtdb/data/releases/release202/202.0/auxillary_files/gtdbtk_r202_data.tar.gz

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/mag for your analysis please cite:

* The pipeline publication
  https://doi.org/10.1093/nargab/lqac007

* The pipeline
  https://doi.org/10.5281/zenodo.3589527

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/mag/blob/master/CITATIONS.md
------------------------------------------------------
executor >  local (7)
[47/9be65c] process > NFCORE_MAG:MAG:FASTQC_RAW (NG-30689_QN1_4_3_lib613328_10075_2)                                                                            [100%] 1 of 1, cached: 1 ✔
[3d/396de6] process > NFCORE_MAG:MAG:FASTP (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                 [100%] 1 of 1, cached: 1 ✔
[ac/adbb55] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)                                                    [100%] 1 of 1, cached: 1 ✔
[16/88f95a] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (NG-30689_QN1_4_3_lib613328_10075_2)                                                            [100%] 1 of 1, cached: 1 ✔
[9b/ec6fb9] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (NG-30689_QN1_4_3_lib613328_10075_2)                                                                        [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                                                               -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                                                                          -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE_DB_PREPARATION                                                                                                  -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE                                                                                                                 -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2_DB_PREPARATION                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2                                                                                                                    -
[37/8a2ffc] process > NFCORE_MAG:MAG:MEGAHIT (NG-30689_QN1_4_3_lib613328_10075_2)                                                                               [100%] 1 of 1, cached: 1 ✔
[8a/bf0dd1] process > NFCORE_MAG:MAG:SPADES (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                                                               -
[3c/1903eb] process > NFCORE_MAG:MAG:QUAST (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                                         [100%] 2 of 2, cached: 2 ✔
[6b/450699] process > NFCORE_MAG:MAG:PRODIGAL (NG-30689_QN1_4_3_lib613328_10075_2)                                                                              [100%] 2 of 2, cached: 2 ✔
[bd/0fff10] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                    [100%] 2 of 2, cached: 2 ✔
[ff/266e2f] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2-NG-30689_QN1_4_3_lib613328_10075_2) [100%] 2 of 2, cached: 2 ✔
[cd/528041] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                          [100%] 2 of 2, cached: 2 ✔
[e7/d37f31] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                                                [100%] 2 of 2, cached: 2 ✔
[87/0a8ee1] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 2 of 2, cached: 2 ✔
[54/c0b9eb] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                                       [100%] 2 of 2, cached: 2 ✔
[2f/482f33] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 2 of 2, cached: 2 ✔
[f9/a7820f] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                   [100%] 4 of 4, cached: 4 ✔
[af/95fa5c] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.023.fa.gz)                                         [100%] 106 of 106, cached: 106 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                                                                      -
[b3/6f4c47] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_PLOT                                                                                                    -
[2f/5d9547] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_SUMMARY                                                                                                 [100%] 1 of 1, cached: 1 ✔
[c2/76795c] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)                                                 [  0%] 1 of 106, failed: 1
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_PLOT                                                                                                        -
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                                                                     -
[1f/0f9fa9] process > NFCORE_MAG:MAG:QUAST_BINS (SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 4 of 4, cached: 4 ✔
[04/96c715] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                                                                         [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                                                                        -
[d3/503ebe] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)                                                                     [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                                                                      -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                                                                -
[37/3b84e6] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.017)                                                            [ 94%] 100 of 106, cached: 100
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                                                                -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                                                                    -
Execution cancelled -- Finishing pending tasks before exit
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)` terminated with an error exit status (1)

Command executed:

  # ensure augustus has write access to config directory
  if [ N = "Y" ] ; then
      cp -r /usr/local/config/ augustus_config/
      export AUGUSTUS_CONFIG_PATH=augustus_config
  fi
  
  # place db in extra folder to ensure BUSCO recognizes it as path (instead of downloading it)
  if [ N = "Y" ] ; then
      mkdir dataset
      mv  dataset/
  fi
  
  # set nullgob: if pattern matches no files, expand to a null string rather than to itself
  shopt -s nullglob
  
  # only used for saving busco downloads
  most_spec_db="NA"
  
  if busco --auto-lineage         --mode genome         --in MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa         --cpu "8"         --out "BUSCO" > MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log 2> MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err; then
  
      # get name of used specific lineage dataset
      summaries=(BUSCO/short_summary.specific.*.BUSCO.txt)
      if [ ${#summaries[@]} -ne 1 ]; then
          echo "ERROR: none or multiple 'BUSCO/short_summary.specific.*.BUSCO.txt' files found. Expected one."
          exit 1
      fi
      [[ $summaries =~ BUSCO/short_summary.specific.(.*).BUSCO.txt ]];
      db_name_spec="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_spec}
      echo "Used specific lineage dataset: ${db_name_spec}"
  
      if [ N = "Y" ]; then
          cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          # if lineage dataset is provided, BUSCO analysis does not fail in case no genes can be found as when using the auto selection setting
          # report bin as failed to allow consistent warnings within the pipeline for both settings
          if egrep -q $'WARNING:	BUSCO did not find any match.' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "WARNING: BUSCO could not find any genes for the provided lineage dataset! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log."
              echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
          fi
      else
          # auto lineage selection
          if { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Lineage \S+ is selected, supported by ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; } ||                 { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	The results from the Prodigal gene predictor indicate that your data belongs to the mollicutes clade. Testing subclades...' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Using local lineages directory ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; }; then
              # the second statement is necessary, because certain mollicute clades use a different genetic code, are not part of the BUSCO placement tree, are tested separately
              # and cause different log messages
              echo "Domain and specific lineage could be selected by BUSCO."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
              db_name_gen=""
              summaries_gen=(BUSCO/short_summary.generic.*.BUSCO.txt)
              if [ ${#summaries_gen[@]} -lt 1 ]; then
                  echo "No 'BUSCO/short_summary.generic.*.BUSCO.txt' file found. Assuming selected domain and specific lineages are the same."
                  cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
                  db_name_gen=${db_name_spec}
              else
                  [[ $summaries_gen =~ BUSCO/short_summary.generic.(.*).BUSCO.txt ]];
                  db_name_gen="${BASH_REMATCH[1]}"
                  echo "Used generic lineage dataset: ${db_name_gen}"
                  cp BUSCO/short_summary.generic.${db_name_gen}.BUSCO.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
              fi
  
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
                  break
              done
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
                  break
              done
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Not enough markers were placed on the tree \([0-9]*\). Root lineage \S+ is kept' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "Domain could be selected by BUSCO, but no more specific lineage."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Running virus detection pipeline' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              # TODO double-check if selected dataset is not one of bacteria_*, archaea_*, eukaryota_*?
              echo "Domain could not be selected by BUSCO, but virus dataset was selected."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
          else
              echo "ERROR: Some not expected case occurred! See MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log." >&2
              exit 1
          fi
      fi
  
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.faa.gz
          break
      done
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.fna.gz
          break
      done
  
  elif egrep -q $'ERROR:	No genes were recognized by BUSCO' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to no recognized genes! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
  elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'ERROR:	Placements failed' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to failed placements! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err. Still using results for selected generic lineage dataset."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	Placements failed" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
      message=$(egrep $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log)
      [[ $message =~ INFO:[[:space:]]([_[:alnum:]]+)[[:space:]]selected ]];
      db_name_gen="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_gen}
      echo "Used generic lineage dataset: ${db_name_gen}"
      cp BUSCO/auto_lineage/run_${db_name_gen}/short_summary.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
          break
      done
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
          break
executor >  local (7)
[47/9be65c] process > NFCORE_MAG:MAG:FASTQC_RAW (NG-30689_QN1_4_3_lib613328_10075_2)                                                                            [100%] 1 of 1, cached: 1 ✔
[3d/396de6] process > NFCORE_MAG:MAG:FASTP (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                 [100%] 1 of 1, cached: 1 ✔
[ac/adbb55] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_BUILD (GCA_002596845.1_ASM259684v1_genomic.fna.gz)                                                    [100%] 1 of 1, cached: 1 ✔
[16/88f95a] process > NFCORE_MAG:MAG:BOWTIE2_PHIX_REMOVAL_ALIGN (NG-30689_QN1_4_3_lib613328_10075_2)                                                            [100%] 1 of 1, cached: 1 ✔
[9b/ec6fb9] process > NFCORE_MAG:MAG:FASTQC_TRIMMED (NG-30689_QN1_4_3_lib613328_10075_2)                                                                        [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_RAW                                                                                                               -
[-        ] process > NFCORE_MAG:MAG:PORECHOP                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOLYSE                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:FILTLONG                                                                                                                   -
[-        ] process > NFCORE_MAG:MAG:NANOPLOT_FILTERED                                                                                                          -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE_DB_PREPARATION                                                                                                  -
[-        ] process > NFCORE_MAG:MAG:CENTRIFUGE                                                                                                                 -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2_DB_PREPARATION                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:KRAKEN2                                                                                                                    -
[37/8a2ffc] process > NFCORE_MAG:MAG:MEGAHIT (NG-30689_QN1_4_3_lib613328_10075_2)                                                                               [100%] 1 of 1, cached: 1 ✔
[8a/bf0dd1] process > NFCORE_MAG:MAG:SPADES (NG-30689_QN1_4_3_lib613328_10075_2)                                                                                [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:SPADESHYBRID                                                                                                               -
[3c/1903eb] process > NFCORE_MAG:MAG:QUAST (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                                         [100%] 2 of 2, cached: 2 ✔
[6b/450699] process > NFCORE_MAG:MAG:PRODIGAL (NG-30689_QN1_4_3_lib613328_10075_2)                                                                              [100%] 2 of 2, cached: 2 ✔
[bd/0fff10] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_BUILD (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                    [100%] 2 of 2, cached: 2 ✔
[ff/266e2f] process > NFCORE_MAG:MAG:BINNING_PREPARATION:BOWTIE2_ASSEMBLY_ALIGN (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2-NG-30689_QN1_4_3_lib613328_10075_2) [100%] 2 of 2, cached: 2 ✔
[cd/528041] process > NFCORE_MAG:MAG:BINNING:METABAT2_JGISUMMARIZEBAMCONTIGDEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                          [100%] 2 of 2, cached: 2 ✔
[e7/d37f31] process > NFCORE_MAG:MAG:BINNING:CONVERT_DEPTHS (NG-30689_QN1_4_3_lib613328_10075_2)                                                                [100%] 2 of 2, cached: 2 ✔
[87/0a8ee1] process > NFCORE_MAG:MAG:BINNING:METABAT2_METABAT2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 2 of 2, cached: 2 ✔
[54/c0b9eb] process > NFCORE_MAG:MAG:BINNING:MAXBIN2 (NG-30689_QN1_4_3_lib613328_10075_2)                                                                       [100%] 2 of 2, cached: 2 ✔
[2f/482f33] process > NFCORE_MAG:MAG:BINNING:ADJUST_MAXBIN2_EXT (MEGAHIT-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 2 of 2, cached: 2 ✔
[f9/a7820f] process > NFCORE_MAG:MAG:BINNING:SPLIT_FASTA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                   [100%] 4 of 4, cached: 4 ✔
[af/95fa5c] process > NFCORE_MAG:MAG:BINNING:GUNZIP_BINS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.023.fa.gz)                                         [100%] 106 of 106, cached: 106 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:GUNZIP_UNBINS                                                                                                      -
[b3/6f4c47] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                    [100%] 4 of 4, cached: 4 ✔
[-        ] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_PLOT                                                                                                    -
[2f/5d9547] process > NFCORE_MAG:MAG:BINNING:MAG_DEPTHS_SUMMARY                                                                                                 [100%] 1 of 1, cached: 1 ✔
[0c/ee7390] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.13.fa)                                                 [  1%] 1 of 100, failed: 1
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_PLOT                                                                                                        -
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                                                                     -
[1f/0f9fa9] process > NFCORE_MAG:MAG:QUAST_BINS (SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2)                                                             [100%] 4 of 4, cached: 4 ✔
[04/96c715] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                                                                         [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:CAT                                                                                                                        -
[d3/503ebe] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_DB_PREPARATION (gtdbtk_r202_data.tar.gz)                                                                     [100%] 1 of 1, cached: 1 ✔
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFY                                                                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                                                                      -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                                                                -
[37/3b84e6] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.017)                                                            [ 94%] 100 of 106, cached: 100
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                                                                -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                                                                    -
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/mag] Pipeline completed with errors-
WARN: Killing running tasks (6)
Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa)` terminated with an error exit status (1)

Command executed:

  # ensure augustus has write access to config directory
  if [ N = "Y" ] ; then
      cp -r /usr/local/config/ augustus_config/
      export AUGUSTUS_CONFIG_PATH=augustus_config
  fi
  
  # place db in extra folder to ensure BUSCO recognizes it as path (instead of downloading it)
  if [ N = "Y" ] ; then
      mkdir dataset
      mv  dataset/
  fi
  
  # set nullgob: if pattern matches no files, expand to a null string rather than to itself
  shopt -s nullglob
  
  # only used for saving busco downloads
  most_spec_db="NA"
  
  if busco --auto-lineage         --mode genome         --in MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa         --cpu "8"         --out "BUSCO" > MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log 2> MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err; then
  
      # get name of used specific lineage dataset
      summaries=(BUSCO/short_summary.specific.*.BUSCO.txt)
      if [ ${#summaries[@]} -ne 1 ]; then
          echo "ERROR: none or multiple 'BUSCO/short_summary.specific.*.BUSCO.txt' files found. Expected one."
          exit 1
      fi
      [[ $summaries =~ BUSCO/short_summary.specific.(.*).BUSCO.txt ]];
      db_name_spec="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_spec}
      echo "Used specific lineage dataset: ${db_name_spec}"
  
      if [ N = "Y" ]; then
          cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          # if lineage dataset is provided, BUSCO analysis does not fail in case no genes can be found as when using the auto selection setting
          # report bin as failed to allow consistent warnings within the pipeline for both settings
          if egrep -q $'WARNING:	BUSCO did not find any match.' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "WARNING: BUSCO could not find any genes for the provided lineage dataset! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log."
              echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
          fi
      else
          # auto lineage selection
          if { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Lineage \S+ is selected, supported by ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; } ||                 { egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	The results from the Prodigal gene predictor indicate that your data belongs to the mollicutes clade. Testing subclades...' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log                 && egrep -q $'INFO:	Using local lineages directory ' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; }; then
              # the second statement is necessary, because certain mollicute clades use a different genetic code, are not part of the BUSCO placement tree, are tested separately
              # and cause different log messages
              echo "Domain and specific lineage could be selected by BUSCO."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
              db_name_gen=""
              summaries_gen=(BUSCO/short_summary.generic.*.BUSCO.txt)
              if [ ${#summaries_gen[@]} -lt 1 ]; then
                  echo "No 'BUSCO/short_summary.generic.*.BUSCO.txt' file found. Assuming selected domain and specific lineages are the same."
                  cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
                  db_name_gen=${db_name_spec}
              else
                  [[ $summaries_gen =~ BUSCO/short_summary.generic.(.*).BUSCO.txt ]];
                  db_name_gen="${BASH_REMATCH[1]}"
                  echo "Used generic lineage dataset: ${db_name_gen}"
                  cp BUSCO/short_summary.generic.${db_name_gen}.BUSCO.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
              fi
  
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
                  break
              done
              for f in BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
                  cat BUSCO/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
                  break
              done
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Not enough markers were placed on the tree \([0-9]*\). Root lineage \S+ is kept' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              echo "Domain could be selected by BUSCO, but no more specific lineage."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.domain.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
          elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'INFO:	Running virus detection pipeline' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log ; then
              # TODO double-check if selected dataset is not one of bacteria_*, archaea_*, eukaryota_*?
              echo "Domain could not be selected by BUSCO, but virus dataset was selected."
              cp BUSCO/short_summary.specific.${db_name_spec}.BUSCO.txt short_summary.specific_lineage.${db_name_spec}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
          else
              echo "ERROR: Some not expected case occurred! See MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log." >&2
              exit 1
          fi
      fi
  
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.faa.gz
          break
      done
      for f in BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/run_${db_name_spec}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_spec}.fna.gz
          break
      done
  
  elif egrep -q $'ERROR:	No genes were recognized by BUSCO' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to no recognized genes! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	No genes" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
  elif egrep -q $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log && egrep -q $'ERROR:	Placements failed' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err ; then
      echo "WARNING: BUSCO analysis failed due to failed placements! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err. Still using results for selected generic lineage dataset."
      echo -e "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa	Placements failed" > "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.failed_bin.txt"
  
      message=$(egrep $'INFO:	\S+ selected' MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log)
      [[ $message =~ INFO:[[:space:]]([_[:alnum:]]+)[[:space:]]selected ]];
      db_name_gen="${BASH_REMATCH[1]}"
      most_spec_db=${db_name_gen}
      echo "Used generic lineage dataset: ${db_name_gen}"
      cp BUSCO/auto_lineage/run_${db_name_gen}/short_summary.txt short_summary.domain.${db_name_gen}.MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa.txt
  
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*faa | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.faa.gz
          break
      done
      for f in BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna; do
          cat BUSCO/auto_lineage/run_${db_name_gen}/busco_sequences/single_copy_busco_sequences/*fna | gzip >MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_buscos.${db_name_gen}.fna.gz
          break
      done
  
  else
      echo "ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err." >&2
      exit 1
  fi
  
  # additionally output genes predicted with Prodigal (GFF3)
  if [ -f BUSCO/logs/prodigal_out.log ]; then
      mv BUSCO/logs/prodigal_out.log "MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_prodigal.gff"
  fi
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:BUSCO_QC:BUSCO":
      python: $(python --version 2>&1 | sed 's/Python //g')
      R: $(R --version 2>&1 | sed -n 1p | sed 's/R version //' | sed 's/ (.*//')
      busco: $(busco --version 2>&1 | sed 's/BUSCO //g')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.

Work dir:
  /media/NGS/nf-core-workflow/work/c2/76795ccd4c946124b7723c02666717

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`


Join mismatch for the following entries: 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.19.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.10.fa values= 
- key=MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.012.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.25.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa values= 
- key=SPAdes-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.004.fa values= 
- key=MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.9.fa values= 
- key=MEGAHIT-MaxBin2-NG-30689_QN1_4_3_lib613328_10075_2.003.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.7.fa values= 
- key=SPAdes-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.11.fa values=
(more omitted)

Relevant files

MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.log MEGAHIT-MetaBAT2-NG-30689_QN1_4_3_lib613328_10075_2.18.fa_busco.err.txt

System information

N E X T F L O W ~ version 22.04.5
nf-core/mag v2.2.0
Container engine: conda
OS:
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster

Hardware: desktop with 128 GB RAM and 32 cores

ChristophKnapp avatar Aug 25 '22 13:08 ChristophKnapp

This seems bad. Could you additionally try using --busco_reference or --busco_download_path? That would mean having the files locally and therefore skipping any download step.

d4straub avatar Aug 25 '22 14:08 d4straub

Also, please do not use -r fix-convert-depths-gzip but -r 2.2.1 ;)

d4straub avatar Aug 25 '22 14:08 d4straub

I have seen the same error, even when specifying either --busco_reference "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz" or --busco_download_path "path/to/bacteria_odb10".

jboktor avatar Aug 25 '22 19:08 jboktor

I am facing the same issue as well

nayeimkhan avatar Aug 25 '22 22:08 nayeimkhan

Also, please do not use -r fix-convert-depths-gzip but -r 2.2.1 ;)

Hi, I was told to use this flag by @jfy133 because of issue #327.

I will upgrade to the latest version, resume the analysis with the suggested flags, and report the results.

ChristophKnapp avatar Aug 26 '22 04:08 ChristophKnapp

Like @jboktor, I can confirm that using --busco_reference or --busco_download_path does not change the outcome.

ChristophKnapp avatar Aug 26 '22 07:08 ChristophKnapp

Hi, I had a similar problem recently. In my case, though, it was solvable by running with -resume multiple times: it only occurred in some BUSCO processes, and the download failure was not reproducible. After a while it worked again, so I didn't dig deeper. However, I am a bit confused why the same problem occurs when using --busco_download_path, since this is used in combination with the --offline parameter. I can have another look at this next week.

skrakau avatar Aug 26 '22 10:08 skrakau

@skrakau I think that's because --busco_download_path refers to the directory where the BUSCO lineage files are located. BUSCO fails to retrieve https://busco-data.ezlab.org/v5/data/file_versions.tsv, which is not among the lineage files. Please correct me if I'm wrong.

Regards

ChristophKnapp avatar Aug 26 '22 11:08 ChristophKnapp

Hi @ChristophKnapp, yes, it refers to the directory that contains, among other things, a folder with the lineage files, but this directory should or could also contain a file_versions.tsv file. The BUSCO user guide says one should download all files from https://busco-data.ezlab.org/v5/data/, which includes a file_versions.tsv file. (The example 'valid download folder' in the guide doesn't contain this file, though; I guess BUSCO would then need to download it. Maybe this needs a bit more documentation in this pipeline.)

The nf-core/mag parameter --busco_download_path causes BUSCO to be run with the BUSCO parameters --offline --download_path <...>, see https://github.com/nf-core/mag/blob/a8e92af70eca59a92b72262e6cdde11e69375801/modules/local/busco.nf#L42 which should prevent BUSCO from trying to download anything. That's why I was confused that it still tries to download the file_versions.tsv file, but if the file is missing it probably makes sense that BUSCO fails.
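As an illustration of the point above, the following is a minimal sketch of a pre-flight check on a local download folder before running BUSCO offline. It is not part of nf-core/mag or BUSCO; the function name check_busco_download_dir is hypothetical, and the assumption that the folder needs both file_versions.tsv and a lineages/ subfolder is taken only from this discussion.

```shell
#!/usr/bin/env bash
# Sketch: check that a local BUSCO download folder looks usable offline.
# Assumption (from this thread, not from BUSCO's code): the folder should
# contain file_versions.tsv and a lineages/ subfolder.
# The function name is hypothetical.
check_busco_download_dir() {
    local dir="$1"
    local status=0

    if [ ! -f "${dir}/file_versions.tsv" ]; then
        echo "missing: ${dir}/file_versions.tsv" >&2
        status=1
    fi

    if [ ! -d "${dir}/lineages" ]; then
        echo "missing: ${dir}/lineages/" >&2
        status=1
    fi

    # 0 if both entries exist, 1 otherwise
    return "$status"
}
```

One possible use is to call check_busco_download_dir path/to/busco_downloads first and only pass that folder to --busco_download_path when the function returns 0.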

skrakau avatar Aug 26 '22 14:08 skrakau

The question remains why the download of the file fails, so talking to the BUSCO developers might be good anyway. If you create an issue, could you link it here? Otherwise I could also do it next week.

skrakau avatar Aug 26 '22 14:08 skrakau

Otherwise I could also do it next week.

@skrakau, I would prefer it if you did it. You have more insight into what is going on and a better understanding of how BUSCO is integrated.

Thank you

Christoph

ChristophKnapp avatar Aug 29 '22 11:08 ChristophKnapp

I opened an issue: https://gitlab.com/ezlab/busco/-/issues/593

Feel free to add further details, in case I forgot something.

skrakau avatar Aug 31 '22 13:08 skrakau

Apparently a rate limit was introduced on the BUSCO server a while ago. It probably caused problems in particular when multiple BUSCO processes were running in parallel, which also explains why wget works without problems. This rate limit will be increased. We need to check whether this will be sufficient for now. So @ChristophKnapp and @nayeimkhan, let us know if this helps.

Independently of this, we should update BUSCO to version 5.4.x at some point, which contains a failsafe mechanism that reattempts a connection in case of failure.
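For readers stuck on an older BUSCO version, the general idea of reattempting a failed connection can be sketched as a small shell helper. This is a generic retry loop with a doubling delay, not BUSCO's actual failsafe code; the function name retry and its argument order are made up for illustration.

```shell
#!/usr/bin/env bash
# Generic retry helper (hypothetical; NOT BUSCO's own failsafe mechanism).
# Usage: retry <max_attempts> <initial_delay_seconds> <command> [args...]
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1

    while true; do
        # Run the command; stop as soon as it succeeds.
        if "$@"; then
            return 0
        fi

        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after ${attempt} attempts: $*" >&2
            return 1
        fi

        echo "attempt ${attempt} failed, retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        # Double the wait so repeated failures put less load on the server.
        delay=$((delay * 2))
    done
}
```

For example, retry 5 10 curl -fsSL -o file_versions.tsv https://busco-data.ezlab.org/v5/data/file_versions.tsv would try the download up to five times, waiting 10, 20, 40 and 80 seconds between attempts.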

skrakau avatar Sep 01 '22 06:09 skrakau

Hi @skrakau, the fix works. Thanks!

nayeimkhan avatar Sep 07 '22 15:09 nayeimkhan

FYI (maybe this will help someone with a similar issue): I ran into the same problem. I am running the pipeline with AWS Batch. I tried --busco_download_path pointing to a local folder with manually unpacked data (as instructed), and for some reason the pipeline froze (no error, just dead), showing an inactive BUSCO process:

process > NFCORE_MAG:MAG:METABAT2_BINNING:MAG_DEPTHS_SUMMARY                   [100%] 1 of 1, cached: 1✔
process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (SPAdes-B220601001.49.fa)              -

What helped in my case was a combination of both:

  • changing the container to quay.io/biocontainers/busco:5.4.3--pyhdfd78af_0 (in busco.nf)
  • providing the reference with --busco_reference "https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2020-03-06.tar.gz"

bmlab-sg avatar Sep 30 '22 16:09 bmlab-sg

I will close this issue, as the original download issue due to the rate limit was fixed. Feel free to open a new issue if similar issues occur again.

@bmlab-sg if your issue remains or recurs, please open a new, separate issue as well.

skrakau avatar Mar 02 '23 10:03 skrakau

Hi @skrakau and @jfy133

I've just run into this old issue now, with version 2.5.4 of the pipeline. My nf-core/mag command specifies the BUSCO DB as such:

--busco_db https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz

It's probably a similar issue with multiple BUSCO jobs attempting to access the URL, and their server blocking new connections after a while:

[4d/3595c3] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971107.64.fa)                          [ 26%] 551 of 2078, failed: 1

amizeranschi avatar May 28 '24 04:05 amizeranschi

I guess the only solution here is to download the database manually :/ and pass that to the pipeline instead.
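A minimal sketch of that manual download step, assuming wget is available: fetch the archive once, reuse the local copy on later runs, and hand the local path to the pipeline. The helper name fetch_once is hypothetical and not part of nf-core/mag.

```shell
#!/usr/bin/env bash
# Sketch: download a file only if no non-empty local copy exists yet,
# then print the local path. The function name is hypothetical.
# Usage: fetch_once <url> [destination_dir]
fetch_once() {
    local url="$1"
    local dest_dir="${2:-.}"
    local target
    target="${dest_dir}/$(basename "$url")"

    if [ ! -s "$target" ]; then
        # Remove a partial or empty file if the download fails.
        if ! wget -q -O "$target" "$url"; then
            rm -f "$target"
            echo "download failed: $url" >&2
            return 1
        fi
    fi

    echo "$target"
}
```

The printed path could then be passed to the pipeline, for example db=$(fetch_once https://busco-data.ezlab.org/v5/data/lineages/bacteria_odb10.2024-01-08.tar.gz) followed by a nextflow run nf-core/mag call that sets --busco_db "$db" alongside the other options.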

jfy133 avatar May 28 '24 04:05 jfy133

Weirdly enough, I tried this now and STILL get the same error. To be more specific, I downloaded the archive with wget and am running nf-core/mag with the options --busco_db bacteria_odb10.2024-01-08.tar.gz and -resume. I get the following in the output:

[f3/5ae729] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.67.fa)                          [  4%] 103 of 2078, failed: 4
[-        ] process > NFCORE_MAG:MAG:BUSCO_QC:BUSCO_SUMMARY                                                       -
[8c/69840e] process > NFCORE_MAG:MAG:QUAST_BINS (MEGAHIT-MetaBAT2-unclassified-unrefined-SRR16971104)             [100%] 7 of 7 ✔
[-        ] process > NFCORE_MAG:MAG:QUAST_BINS_SUMMARY                                                           -
[-        ] process > NFCORE_MAG:MAG:CAT                                                                          -
[-        ] process > NFCORE_MAG:MAG:CAT_SUMMARY                                                                  -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_CLASSIFYWF                                                     -
[-        ] process > NFCORE_MAG:MAG:GTDBTK:GTDBTK_SUMMARY                                                        -
[-        ] process > NFCORE_MAG:MAG:BIN_SUMMARY                                                                  -
[3b/f2dd56] process > NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-SRR16971104.441)                                    [ 99%] 2076 of 2078, cached: 701
[-        ] process > NFCORE_MAG:MAG:CUSTOM_DUMPSOFTWAREVERSIONS                                                  -
[-        ] process > NFCORE_MAG:MAG:MULTIQC                                                                      -
ERROR ~ Error executing process > 'NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.34.fa)'

Caused by:
  Process `NFCORE_MAG:MAG:BUSCO_QC:BUSCO (MEGAHIT-MetaBAT2-SRR16971103.34.fa)` terminated with an error exit status (1)

Command executed:

  run_busco.sh "--lineage_dataset dataset/bacteria_odb10" "Y" "bacteria_odb10" "MEGAHIT-MetaBAT2-SRR16971103.34.fa" 8 "Y" "N"
  most_spec_db=$(<info_most_spec_db.txt)
  
  cat <<-END_VERSIONS > versions.yml
  "NFCORE_MAG:MAG:BUSCO_QC:BUSCO":
      python: $(python --version 2>&1 | sed 's/Python //g')
      R: $(R --version 2>&1 | sed -n 1p | sed 's/R version //' | sed 's/ (.*//')
      busco: $(busco --version 2>&1 | sed 's/BUSCO //g')
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  ERROR: BUSCO analysis failed for some unknown reason! See also MEGAHIT-MetaBAT2-SRR16971103.34.fa_busco.err.

Work dir:
  /data/share/horia-banciu/work/f3/602e4961f0a35ee674d094dc7b6626

And this is the contents of MEGAHIT-MetaBAT2-SRR16971103.34.fa_busco.err:

2024-05-28 05:14:55 ERROR:	Cannot reach https://busco-data2.ezlab.org/v5/data/file_versions.tsv
2024-05-28 05:14:55 ERROR:	BUSCO analysis failed!
2024-05-28 05:14:55 ERROR:	Check the logs, read the user guide (https://busco.ezlab.org/busco_userguide.html), and check the BUSCO issue board on https://gitlab.com/ezlab/busco/issues

Why is BUSCO still trying to access https://busco-data2.ezlab.org/v5/data/file_versions.tsv, when I'm running the pipeline with a local database?

I'm attaching the full log file, in case it helps: nextflow-busco-url-error.log.txt
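
To check how many BUSCO tasks hit this same connectivity error, the .err files can be searched across the Nextflow work directory. This is a rough sketch that assumes the *_busco.err files sit inside the task work directories, as the error message above suggests; the function name is made up for illustration.

```shell
#!/usr/bin/env bash
set -eu

# Print every *_busco.err file under WORK_DIR that contains the
# "Cannot reach" message; prints nothing if there is no match.
# list_unreachable_busco_logs is a hypothetical helper name.
list_unreachable_busco_logs() {
    work_dir="$1"
    grep -rl --include='*_busco.err' 'Cannot reach https://' "$work_dir" || true
}

# Example (not executed here):
#   list_unreachable_busco_logs work | wc -l
```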

amizeranschi avatar May 28 '24 06:05 amizeranschi

Ugh, that looks bad... Maybe it always does an internet lookup?

I've not actually used BUSCO manually myself... @skrakau if you remember, do you have any ideas?

jfy133 avatar May 28 '24 07:05 jfy133

Facing the exact same issue, currently testing the --offline flag to see if we can force it to not do an internet lookup.

b-kolar avatar Jun 14 '24 19:06 b-kolar

Please let me know if it works @b-kolar - I started investigating this yesterday at the airport but couldn't finish before I had to fly. Otherwise I'll get back to this on Thursday.

jfy133 avatar Jun 15 '24 14:06 jfy133

I can confirm that the --offline flag works with BUSCO!

We are testing a modified version of the mag pipeline now, which has so far passed the BUSCO steps without issues.
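
For reference, a standalone BUSCO call with --offline could look roughly like the sketch below. It is not the modification made to the pipeline, which is not shown in this thread; the wrapper name run_busco_offline, the genome mode and the output name are assumptions, while the lineage path dataset/bacteria_odb10 is taken from the run_busco.sh command quoted earlier.

```shell
#!/usr/bin/env bash
set -eu

# Run BUSCO against an already unpacked lineage directory and tell it not
# to attempt any downloads. run_busco_offline is a hypothetical wrapper;
# genome mode is an assumption about how the bins are assessed.
run_busco_offline() {
    fasta="$1"        # bin to assess
    lineage_dir="$2"  # unpacked lineage, e.g. dataset/bacteria_odb10
    out_name="$3"     # name of the BUSCO output folder
    cpus="$4"
    busco \
        --in "$fasta" \
        --out "$out_name" \
        --mode genome \
        --lineage_dataset "$lineage_dir" \
        --cpu "$cpus" \
        --offline
}

# Example (not executed here):
#   run_busco_offline MEGAHIT-MetaBAT2-SRR16971103.34.fa dataset/bacteria_odb10 busco_out 8
```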

b-kolar avatar Jun 17 '24 19:06 b-kolar

Thank you @b-kolar! I might ping you when my implementation is ready to make sure we added it in roughly the same way, if that's OK?

jfy133 avatar Jun 18 '24 04:06 jfy133

@jfy133 No problem, feel free to send any questions my way!

b-kolar avatar Jun 18 '24 18:06 b-kolar

Should be fixed here! @b-kolar could you test -r busco-offline? https://github.com/nf-core/mag/pull/633

jfy133 avatar Jun 27 '24 13:06 jfy133