rnaseq
GUNZIP does not work for relative paths specified with --fasta and --gtf
Description of the bug
Unzipping of the FASTA and GTF files fails when I provide a relative path to those files. I can work around this by either unzipping the file first or providing an absolute path. Below is a minimal example for the GTF file only; I get the same kind of error for the FASTA file.
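The two workarounds mentioned above can be sketched in shell. This is only a rough illustration, not a fix for the pipeline: `to_abs_path` is a hypothetical helper name introduced here, and it simply prefixes the current directory to a relative path (it does not resolve symlinks or `..` segments). Absolute paths and remote URLs are passed through unchanged.

```shell
# Hypothetical helper (not part of nf-core/rnaseq or Nextflow):
# print an absolute form of the given path.
to_abs_path() {
    case "$1" in
        # Already absolute, or a remote URL such as https://...: leave as-is.
        /*|*://*) printf '%s\n' "$1" ;;
        # Relative path: prefix the current working directory.
        *)        printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

# Workaround 1 (absolute path), using the same options as the failing command:
#   nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output \
#       --gtf "$(to_abs_path Mus_musculus.GRCm39.111.gtf.gz)"
#
# Workaround 2 (unzip first, keeping the original archive):
#   gunzip -c Mus_musculus.GRCm39.111.gtf.gz > Mus_musculus.GRCm39.111.gtf
#   nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output \
#       --gtf Mus_musculus.GRCm39.111.gtf
```

`gunzip -c` is used instead of `gunzip -k` because older gzip releases do not support the `-k` (keep) option.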
Command used and terminal output
$ wget https://ftp.ensembl.org/pub/release-111/gtf/mus_musculus/Mus_musculus.GRCm39.111.gtf.gz
$ nextflow run nf-core/rnaseq -profile test -r 3.14.0 --outdir output --gtf Mus_musculus.GRCm39.111.gtf.gz
N E X T F L O W ~ version 24.04.2
Launching `https://github.com/nf-core/rnaseq` [tiny_engelbart] DSL2 - revision: b89fac3265 [3.14.0]
------------------------------------------------------
,--./,-.
___ __ __ __ ___ /,-._.--~'
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/rnaseq v3.14.0-gb89fac3
------------------------------------------------------
Core Nextflow options
revision : 3.14.0
runName : tiny_engelbart
launchDir : /lustre/projects/Fabian_Rost/temp/rnaseq
workDir : /lustre/projects/Fabian_Rost/temp/rnaseq/work
projectDir : /home/rost/.nextflow/assets/nf-core/rnaseq
userName : rost
profile : test
configFiles :
Input/output options
input : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/samplesheet/v3.10/samplesheet_test.csv
outdir : output
Reference genome options
fasta : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/genome.fasta
gtf : Mus_musculus.GRCm39.111.gtf.gz
gff : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/genes.gff.gz
transcript_fasta : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/transcriptome.fasta
additional_fasta : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/gfp.fa.gz
hisat2_index : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/hisat2.tar.gz
rsem_index : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/rsem.tar.gz
salmon_index : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/salmon.tar.gz
Read filtering options
bbsplit_fasta_list : https://raw.githubusercontent.com/nf-core/test-datasets/7f1614baeb0ddf66e60be78c3d9fa55440465ac8/reference/bbsplit_fasta_list.txt
UMI options
umitools_bc_pattern : NNNN
Alignment options
pseudo_aligner : salmon
min_mapped_reads : 5
Process skipping options
skip_bbsplit : false
Institutional config options
config_profile_name : Test profile
config_profile_description: Minimal test dataset to check pipeline function
Max job request options
max_cpus : 2
max_memory : 6.GB
max_time : 6.h
!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/rnaseq for your analysis please cite:
* The pipeline
https://doi.org/10.5281/zenodo.1400710
* The nf-core framework
https://doi.org/10.1038/s41587-020-0439-x
* Software dependencies
https://github.com/nf-core/rnaseq/blob/master/CITATIONS.md
WARN: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Both '--gtf' and '--gff' parameters have been provided.
Using GTF file as priority.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
WARN: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'--transcript_fasta' parameter has been provided.
Make sure transcript names in this file match those in the GFF/GTF file.
Please see:
https://github.com/nf-core/rnaseq/issues/753
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF_FILTER -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_ADDITIONAL_FASTA [ 0%] 0 of 1
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CAT_ADDITIONAL_FASTA -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GTF2BED -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:CUSTOM_GETCHROMSIZES -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:BBMAP_BBSPLIT -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:STAR_GENOMEGENERATE -
[- ] process > NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:UNTAR_SALMON_INDEX [ 0%] 0 of 1
[- ] process > NFCORE_RNASEQ:RNASEQ:CAT_FASTQ [ 0%] 0 of 2
[- ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:FASTQC [ 0%] 0 of 2
[- ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_FASTQC_UMITOOLS_TRIMGALORE:TRIMGALORE [ 0%] 0 of 2
[- ] process > NFCORE_RNASEQ:RNASEQ:BBMAP_BBSPLIT -
[- ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:FQ_SUBSAMPLE -
[- ] process > NFCORE_RNASEQ:RNASEQ:FASTQ_SUBSAMPLE_FQ_SALMON:SALMON_QUANT -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:STAR_ALIGN -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_SORT -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:SAMTOOLS_INDEX -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT -
[- ] process > NFCORE_RNASEQ:RNASEQ:ALIGN_STAR:BAM_SORT_STATS_SAMTOOLS:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SALMON_QUANT -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:TX2GENE -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:TXIMPORT -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE_LENGTH_SCALED -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_GENE_SCALED -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_STAR_SALMON:SE_TRANSCRIPT -
[- ] process > NFCORE_RNASEQ:RNASEQ:DESEQ2_QC_STAR_SALMON -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:PICARD_MARKDUPLICATES -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:SAMTOOLS_INDEX -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_STATS -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_FLAGSTAT -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_MARKDUPLICATES_PICARD:BAM_STATS_SAMTOOLS:SAMTOOLS_IDXSTATS -
[- ] process > NFCORE_RNASEQ:RNASEQ:STRINGTIE_STRINGTIE -
[- ] process > NFCORE_RNASEQ:RNASEQ:SUBREAD_FEATURECOUNTS -
[- ] process > NFCORE_RNASEQ:RNASEQ:MULTIQC_CUSTOM_BIOTYPE -
[- ] process > NFCORE_RNASEQ:RNASEQ:BEDTOOLS_GENOMECOV -
[- ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDCLIP -
[- ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_FORWARD:UCSC_BEDGRAPHTOBIGWIG -
[- ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDCLIP -
[- ] process > NFCORE_RNASEQ:RNASEQ:BEDGRAPH_BEDCLIP_BEDGRAPHTOBIGWIG_REVERSE:UCSC_BEDGRAPHTOBIGWIG -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUALIMAP_RNASEQ -
[- ] process > NFCORE_RNASEQ:RNASEQ:DUPRADAR -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_BAMSTAT -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INNERDISTANCE -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_INFEREXPERIMENT -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONANNOTATION -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_JUNCTIONSATURATION -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDISTRIBUTION -
[- ] process > NFCORE_RNASEQ:RNASEQ:BAM_RSEQC:RSEQC_READDUPLICATION -
[- ] process > NFCORE_RNASEQ:RNASEQ:QUANTIFY_PSEUDO_ALIGNMENT:SALMON_QUANT -
Plus 9 more processes waiting for tasks…
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/rnaseq] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_RNASEQ:RNASEQ:PREPARE_GENOME:GUNZIP_GTF'
Caused by:
Not a valid path value: 'Mus_musculus.GRCm39.111.gtf.gz'
Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line
-- Check '.nextflow.log' file for details
Relevant files
System information
- Nextflow 24.04.2
- HPC
- local executor
- CentOS Linux release 7.4.1708
- nf-core/rnaseq v3.14.0