GUNZIP module is confused by repeated patterns in sample name and fasta path.

Open m3hdad opened this issue 2 months ago • 0 comments

Description of the bug

A couple of months ago we had a Slack thread about how NFCORE_FUNCSCAN:FUNCSCAN:GUNZIP_PYRODIGAL_FNA gets confused between meta.id and the gz file path when the sample name is repeated along the full path to the FASTA file.

The topic is discussed here on Slack. Workaround: changing the sample names avoids the problem.

ERROR ~ Error executing process > 'NFCORE_FUNCSCAN:FUNCSCAN:GUNZIP_PYRODIGAL_FNA ([GCA_001438805.1.fna.gz, GCA_001438805.1_ASM143880v1_genomic.fna.gz])'

Caused by:
  Missing output file(s) `GCA_001438805.1.fna GCA_001438805.1_ASM143880v1_genomic.fna.gz` expected by process `NFCORE_FUNCSCAN:FUNCSCAN:GUNZIP_PYRODIGAL_FNA ([GCA_001438805.1.fna.gz, GCA_001438805.1_ASM143880v1_genomic.fna.gz])`

Command executed:

  # Not calling gunzip itself because it creates files
  # with the original group ownership rather than the
  # default one for that user / the work directory
  gzip \
      -cd \
       \
      GCA_001438805.1.fna.gz GCA_001438805.1_ASM143880v1_genomic.fna.gz \
      > GCA_001438805.1.fna GCA_001438805.1_ASM143880v1_genomic.fna.gz

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_FUNCSCAN:FUNCSCAN:GUNZIP_PYRODIGAL_FNA":
      gunzip: $(echo $(gunzip --version 2>&1) | sed 's/^.*(gzip) //; s/ Copyright.*$//')
  END_VERSIONS

Command exit status:
  0

Command output:
  (empty)

Work dir:
  /home/test/.work/e1/6e71e01659781bea8f6deb48144838

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
ERROR ~ Failed to invoke `workflow.onComplete` event handler

 -- Check script '/home/.nextflow/assets/nf-core/funcscan/./workflows/funcscan.nf' at line: 314 or see '.nextflow.log' file for more details
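The odd "missing output" name in the log above can be reproduced with plain string handling. The sketch below assumes the module derives its output name by removing only the first `.gz` from the space-joined input names; that is an inference from the log, not something checked against the module source, and `strip_first_gz` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: remove only the FIRST ".gz" from a string.
# Assumption: this mimics how the expected output name was derived
# from the space-joined names of the staged input files.
strip_first_gz() {
    local joined="$1"
    # ${var/pattern/} replaces the first match only.
    printf '%s\n' "${joined/.gz/}"
}

# The two inputs staged in the failing task, as listed in the error message.
inputs="GCA_001438805.1.fna.gz GCA_001438805.1_ASM143880v1_genomic.fna.gz"

# Prints: GCA_001438805.1.fna GCA_001438805.1_ASM143880v1_genomic.fna.gz
strip_first_gz "$inputs"
```

The printed string matches the file name reported as missing. Separately, in the logged command the `>` redirection applies only to the first word after it, so the second name is passed to `gzip` as an extra argument instead of being written as a file. That would be consistent with exit status 0 and a missing-output error, since no file with the combined name is ever created.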

The input file that resulted in this error was:

sample,fasta
GCA_000184535.1,/home/test/genomes/ncbi_dataset/data/GCA_000184535.1/GCA_000184535.1_ASM18453v1_genomic.fna
GCA_000260455.1,/home/test/genomes/ncbi_dataset/data/GCA_000260455.1/GCA_000260455.1_ASM26045v1_genomic.fna
GCA_000615725.1,/home/test/genomes/ncbi_dataset/data/GCA_000615725.1/GCA_000615725.1_ASM61572v1_genomic.fna

Changing the input file to the following fixed the issue:

sample,fasta
sample-1,/home/test/genomes/ncbi_dataset/data/GCA_000184535.1/GCA_000184535.1_ASM18453v1_genomic.fna
sample-2,/home/test/genomes/ncbi_dataset/data/GCA_000260455.1/GCA_000260455.1_ASM26045v1_genomic.fna
sample-3,/home/test/genomes/ncbi_dataset/data/GCA_000615725.1/GCA_000615725.1_ASM61572v1_genomic.fna
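To spot affected rows before a run, a small check along these lines may help. It is only a sketch: it assumes a two-column `sample,fasta` CSV with a header row and no quoted commas, and `find_repeated_ids` and `samplesheet.csv` are hypothetical names. It flags the pattern described in this issue (sample name repeated in the path); it does not prove that a flagged row will fail.

```shell
#!/usr/bin/env bash
# Hypothetical check: print every sample ID that also occurs as a
# substring of its own fasta path.
# Assumes a "sample,fasta" header line and no quoted commas in fields.
find_repeated_ids() {
    awk -F',' 'NR > 1 && $1 != "" && index($2, $1) > 0 { print $1 }' "$1"
}

# Run the check only when a samplesheet path is given on the command line,
# e.g.: bash check_ids.sh samplesheet.csv
if [ "$#" -gt 0 ]; then
    find_repeated_ids "$1"
fi
```

Run against the first samplesheet above, this would list all three `GCA_...` IDs; against the renamed one it would print nothing.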

Command used and terminal output

No response

Relevant files

No response

System information

No response

m3hdad, Apr 26 '24 09:04