
Error in executing PROKKA

Open anugos opened this issue 3 months ago • 4 comments

nf-core/mag v2.5.4-ge486bb2
Run Name: bq-mag19
nf-core/mag execution completed unsuccessfully! The exit status of the task that caused the workflow execution to fail was: 2.

The full error message was:

Error executing process > 'NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14)'

Caused by: Process NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14) terminated with an error exit status (2)

Command executed:

prokka \
    --metagenome \
    --cpus 2 \
    --prefix MEGAHIT-MetaBAT2-group-10.14 \
    MEGAHIT-MetaBAT2-group-10.14.fa

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:PROKKA":
    prokka: $(echo $(prokka --version 2>&1) | sed 's/^.*prokka //')
END_VERSIONS

Command exit status: 2

Command output: (empty)

Command error:
[21:24:50] Determined blastp version is 002012 from 'blastp: 2.12.0+'
[21:24:50] Looking for 'cmpress' - found /usr/local/bin/cmpress
[21:24:50] Determined cmpress version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[21:24:50] Looking for 'cmscan' - found /usr/local/bin/cmscan
[21:24:50] Determined cmscan version is 001001 from '# INFERNAL 1.1.4 (Dec 2020)'
[21:24:50] Looking for 'egrep' - found /bin/egrep
[21:24:50] Looking for 'find' - found /usr/bin/find
[21:24:50] Looking for 'grep' - found /bin/grep
[21:24:50] Looking for 'hmmpress' - found /usr/local/bin/hmmpress
[21:24:50] Determined hmmpress version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[21:24:50] Looking for 'hmmscan' - found /usr/local/bin/hmmscan
[21:24:50] Determined hmmscan version is 003003 from '# HMMER 3.3.2 (Nov 2020); http://hmmer.org/'
[21:24:50] Looking for 'java' - found /usr/local/bin/java
[21:24:50] Looking for 'makeblastdb' - found /usr/local/bin/makeblastdb
[21:24:50] Determined makeblastdb version is 002012 from 'makeblastdb: 2.12.0+'
[21:24:50] Looking for 'minced' - found /usr/local/bin/minced
[21:24:50] Determined minced version is 004002 from 'minced 0.4.2'
[21:24:50] Looking for 'parallel' - found /usr/local/bin/parallel
[21:24:50] Determined parallel version is 20220222 from 'GNU parallel 20220222'
[21:24:50] Looking for 'prodigal' - found /usr/local/bin/prodigal
[21:24:50] Determined prodigal version is 002006 from 'Prodigal V2.6.3: February, 2016'
[21:24:50] Looking for 'prokka-genbank_to_fasta_db' - found /usr/local/bin/prokka-genbank_to_fasta_db
[21:24:50] Looking for 'sed' - found /bin/sed
[21:24:50] Looking for 'tbl2asn' - found /usr/local/bin/tbl2asn
[21:24:51] Determined tbl2asn version is 025007 from 'tbl2asn 25.7 arguments:'
[21:24:51] Using genetic code table 11.
[21:24:51] Loading and checking input file: MEGAHIT-MetaBAT2-group-10.14.fa
[21:24:51] Wrote 65 contigs totalling 205963 bp.
[21:24:51] Predicting tRNAs and tmRNAs
[21:24:51] Running: aragorn -l -gc11 -w MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.fna
[21:24:51] Found 0 tRNAs
[21:24:51] Predicting Ribosomal RNAs
[21:24:51] Running Barrnap with 2 threads
[21:24:51] Found 0 rRNAs
[21:24:51] Skipping ncRNA search, enable with --rfam if desired.
[21:24:51] Total of 0 tRNA + rRNA features
[21:24:51] Searching for CRISPR repeats
[21:24:51] Found 0 CRISPRs
[21:24:51] Predicting coding sequences
[21:24:51] Contigs total 205963 bp, so using meta mode
[21:24:51] Running: prodigal -i MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.fna -c -m -g 11 -p meta -f sco -q
[21:24:52] Found 226 CDS
[21:24:52] Connecting features back to sequences
[21:24:52] Not using genus-specific database. Try --usegenus to enable it.
[21:24:52] Annotating CDS, please be patient.
[21:24:52] Will use 2 CPUs for similarity searching.
[21:24:52] There are still 226 unannotated CDS left (started with 226)
[21:24:52] Will use blast to search against /usr/local/db/kingdom/Bacteria/IS with 2 CPUs
[21:24:52] Running: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null
[21:24:53] Could not run command: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null

Work dir: /data/user/anugos24/Black-Queen-analysis/Shotgun-Metagenome/redo_results_2024/work/d8/6fda25e960ff9bd0d71920b903df93

Tip: view the complete command output by changing to the process work dir and entering the command cat .command.out

The workflow was completed at 2024-03-12T21:29:39.822735-05:00 (duration: 5m 5s)

The command used to launch the workflow was as follows:

nextflow run nf-core/mag -r 2.5.4 -name bq-mag19 -profile singularity -params-file nf-params.json -c custom.config -resume bq-mag18

Pipeline Configuration:

revision 2.5.4
runName bq-mag19
containerEngine singularity
container [PROKKA:https://depot.galaxyproject.org/singularity/prokka:1.14.6--pl5321hdfd78af_4]
profile singularity
configFiles
phix_reference /home/anugos24/.nextflow/assets/nf-core/mag/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz
lambda_reference /home/anugos24/.nextflow/assets/nf-core/mag/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz
kraken2_db /data/user/database/minikraken_8GB_202003.tgz
skip_krona true
gtdbtk_min_perc_aa 10
gtdbtk_pplacer_cpus 1
coassemble_group true
megahit_options --presets meta-large
skip_spades true
skip_spadeshybrid true
skip_prodigal true
skip_metaeuk true
skip_maxbin2 true
skip_concoct true
bowtie2_mode --very-sensitive
save_assembly_mapped_reads true
busco_db /data/user/bacteria_odb10.2020-03-06.tar.gz
busco_auto_lineage_prok true
busco_clean true
Nextflow Version 23.10.1
Nextflow Build 5891
Nextflow Compile Timestamp 12-01-2024 22:01 UTC

nf-core/mag

anugos avatar Mar 13 '24 02:03 anugos

Hi @anugos This seems to be a common and 'unresolved' prokka error. The recommendation is posted here: https://github.com/tseemann/prokka/issues/402#issuecomment-1547340365

Please install PROKKA manually (e.g. via conda), cd into the work directory reported in the error, then use the command in the .command.sh file to re-run prokka, but without redirecting stdout/stderr, so that the underlying error message is no longer discarded.
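As a rough illustration of that suggestion, the helper below pulls the last "Could not run command:" entry out of a log file and strips the trailing `2> /dev/null`, so the command can be pasted and re-run with its error output visible. It is only a sketch: `failing_command` is a made-up name, and it assumes each log entry sits on its own line in a file such as `.command.err` or `.command.log` in the work directory. Paths inside the command (for example `/usr/local/db/kingdom/Bacteria/IS`) come from the container and will likely differ in a conda install.

```shell
#!/usr/bin/env bash
# Sketch only: recover the command Prokka gave up on, minus the stderr redirect.
# Assumption: the log has one entry per line, as Prokka normally writes it.

# failing_command LOGFILE
# Prints the last "Could not run command:" entry without "2> /dev/null".
failing_command() {
  grep -o 'Could not run command: .*' "$1" \
    | tail -n 1 \
    | sed -e 's/^Could not run command: //' \
          -e 's#[[:space:]]*2>[[:space:]]*/dev/null[[:space:]]*$##'
}

# Example use (hypothetical paths), run by hand:
#   cd /path/to/work/dir          # the "Work dir:" from the error message
#   failing_command .command.err  # then paste and run the printed command
```

Running the printed command directly should show whatever blastp or parallel wrote to stderr, which is the information the original redirect threw away.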

jfy133 avatar Mar 15 '24 13:03 jfy133

Hey @anugos @jfy133! I found a bit of a workaround. I downloaded this container for Prokka and then modified my config to use it. It also would not work via Slurm submission to our HPC, but did on the head node (?!), and I then had to restrict it to running one task at a time so that the tmp directories for Prokka didn't overwrite each other. On second thought, maybe using a different container was unnecessary, but anyway. Overall additions to the config file:

process {
    executor = 'slurm'
    clusterOptions = "-N 1 -p skylake,icelake"
    withName: PROKKA {
        container = '/<path>/prokka_1.14.6--pl5321hdfd78af_5.sif'
        executor = 'local'
        maxForks = 1
    }
}

roberta-davidson avatar Mar 20 '24 07:03 roberta-davidson

@roberta-davidson huh, interesting... what was the actual error for you (i.e., what was otherwise piped to nothing)?

Is it a /tmp clash or something? We could maybe set this to use the process-specific work directory...
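If it does turn out to be a clash in a shared temporary directory (not confirmed anywhere in this thread), one way to test the idea would be to point `TMPDIR` at a private directory inside the task's work directory before Prokka starts. The lines below are a minimal sketch under that assumption; they also assume the tools Prokka calls (GNU parallel in particular) honour `TMPDIR`, and the `tmp.XXXXXX` template name is arbitrary.

```shell
# Sketch only: give the current task its own temporary directory.
# Untested against this PROKKA failure; assumes the tools respect TMPDIR.
task_tmp="$(mktemp -d "$PWD/tmp.XXXXXX")"   # private directory inside the current directory
export TMPDIR="$task_tmp"
echo "Temporary files will go to: $TMPDIR"
```

Whether the variable reaches the tools inside a Singularity container depends on how the container is launched, so this is something to verify rather than assume.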

jfy133 avatar Mar 20 '24 07:03 jfy133

The original command in error from running mag was:

  [17:36:49] Could not run command: cat MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\/MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\.IS\.tmp\.44\.faa | parallel --gnu --plain -j 2 --block 43333 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\/MEGAHIT\-MetaBAT2\-Bushfire_A24936\.15\.IS\.tmp\.44\.blast 2> /dev/null

I really don't understand why this workaround works... I set out to do as you suggested above, and wrote a script to run .command.sh in each work dir using my own prokka container, and then @shyama-mama figured out that we could just adjust the config and point to that container when the pipeline runs. Then I realised that .command.sh with my container worked on the head node but not within the pipeline (no idea why). So I adjusted the config to execute locally, one task at a time.
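The script itself was not posted, but the approach described (running .command.sh in each work dir, one at a time) could look roughly like the sketch below. `rerun_workdirs` and the `RUNNER` variable are hypothetical names; `RUNNER` defaults to plain `bash` and could be set to something like `singularity exec /<path>/prokka_1.14.6--pl5321hdfd78af_5.sif bash` to use a local container.

```shell
# Sketch only: re-run .command.sh in each given work directory, sequentially.
# RUNNER is a hypothetical variable holding the command used to execute the script.
rerun_workdirs() {
  runner="${RUNNER:-bash}"
  for dir in "$@"; do
    if [ -f "$dir/.command.sh" ]; then
      echo "Re-running in $dir"
      # The subshell keeps the cd local to this iteration.
      # $runner is deliberately unquoted so it can hold a multi-word command.
      ( cd "$dir" && $runner .command.sh ) || echo "FAILED: $dir" >&2
    else
      echo "Skipping $dir (no .command.sh)" >&2
    fi
  done
}

# Example use (hypothetical paths), run by hand:
#   rerun_workdirs work/d8/6fda25e960ff9bd0d71920b903df93 work/ab/...
```

Running the directories strictly in sequence mirrors the `maxForks = 1` setting in the config above, which is what avoided the overlapping tmp directories.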

roberta-davidson avatar Mar 20 '24 09:03 roberta-davidson