smrnaseq
Parameter `--skip_fastp` throws an error, parameter `trim_fastq` set to false not working as expected
Description of the bug
As part of my testing to compare trimming parameters, I would like to use input data that has already been trimmed and turn off all trimming steps in the pipeline.
First I tried adding `--skip_fastp` to my run. This resulted in the following error:
ERROR ~ Error executing process > 'NFCORE_SMRNASEQ:SMRNASEQ:MIRTRACE:MIRTRACE_RUN (1)'
Caused by:
Not a valid path value type: java.util.ArrayList ([/sfs/9/ws/qeajl01-smrnaseq_test/data/smrnaseq_trimmed_reads/QBCOS019AT_trimmed.fastq.gz])
However, if I remove the parameter `--skip_fastp` and add the following parameters instead:
--trim_fastq false \
--clip_r1 0 \
--three_prime_clip_r1 0 \
--fastp_min_length 15 \
The pipeline completes without any errors. However, the MultiQC report suggests that some adapter trimming is still being performed.
Could it be that the default sequence of `--three_prime_adapter` is being used for adapter trimming?
Thank you.
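One way to check whether the pre-trimmed input still contains a given adapter is to count the reads whose sequence line includes it. The following is only a minimal sketch: `count_adapter_reads` is a hypothetical helper name, and the adapter sequence must be supplied by the user (no pipeline default is assumed here).

```shell
# Hypothetical helper: count reads whose sequence contains a given adapter.
# Usage: count_adapter_reads <reads.fastq.gz> <ADAPTER_SEQUENCE>
count_adapter_reads() {
    # FASTQ records span 4 lines; the sequence is line 2 of each record.
    gzip -dc "$1" |
        awk -v adapter="$2" 'NR % 4 == 2 && index($0, adapter) > 0 { n++ } END { print n + 0 }'
}
```

A count of zero for the library's adapter would suggest that any trimming reported later is not removing that particular sequence from the input reads.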
Command used and terminal output
nextflow run nf-core/smrnaseq \
-r 2.2.1 \
-profile cfc \
--input /sfs/9/ws/qeajl01-smrnaseq_test/data/samplesheet_1_trimmed_nfcore.csv \
--genome GRCh38 \
--skip_fastp \
--mirtrace_species hsa \
--hairpin /sfs/9/ws/qeajl01-smrnaseq_test/data/mirBase/hairpin.fa \
--mature /sfs/9/ws/qeajl01-smrnaseq_test/data/mirBase/mature.fa \
--mirna_gtf /sfs/9/ws/qeajl01-smrnaseq_test/data/mirBase/hsa.gff3 \
--outdir /sfs/9/ws/qeajl01-smrnaseq_test/results/out_smrnaseq_1_nfcore_skipTrim_skipfastp
Relevant files
No response
System information
Nextflow version: 23.04.2 build 5870
Hardware: HPC
Executor: slurm
Container engine: Singularity
OS: CentOS
Version of nf-core/smrnaseq: 2.2.1
I have the same problem, and I suspect it's because the files are gzipped and miRTrace doesn't try to unzip them. Try unzipping the files and then running the command.
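To try the suggestion above, the trimmed reads can be decompressed alongside the originals. This is a small sketch only: `decompress_fastq` is a hypothetical helper name, the samplesheet would then need to point at the decompressed file, and whether this actually avoids the error is untested here.

```shell
# Hypothetical helper: decompress a gzipped FASTQ next to the original,
# keeping the .gz file in place.
# Usage: decompress_fastq <reads.fastq.gz>   (prints the path of the new file)
decompress_fastq() {
    out="${1%.gz}"                       # strip the trailing .gz suffix
    gzip -dc "$1" > "$out" && printf '%s\n' "$out"
}
```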
I attempted to reproduce this error with the following command on the latest dev version:
nextflow run smrnaseq -profile docker --outdir issue_236_skip_fastp -resume --skip_fastp --input /workspace/smrnaseq/assets/samplesheet.csv --mirtrace_species hsa
The pipeline finished correctly and the fastp step was not executed.
However, according to the methods in the paper, miRTrace applies its own trimming logic: it handles specific adapter sequences and removes reads shorter than 18 nucleotides after adapter trimming, even if the data has already been trimmed before being passed into the pipeline.
The "reads < 18 nt after adapter removal" metric in the MultiQC report is sourced from the mirtrace-results.json file generated by miRTrace, specifically from the statsQC array. This metric counts reads that were trimmed to a length of less than 18 nucleotides by miRTrace during its processing, which indicates that miRTrace is still performing trimming. This means that disabling the external trimming steps (e.g., `--skip_fastp`, `--trim_fastq false`) affects the initial trimming phases in the pipeline, but the internal miRTrace trimming is independent of these settings.
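To put that metric in context, it can help to count how many reads in the input are already shorter than 18 nt before miRTrace sees them. The sketch below is an assumption-laden illustration, not a reproduction of miRTrace's logic: `count_short_reads` is a hypothetical helper name, and the only value taken from the discussion is the 18 nt threshold.

```shell
# Hypothetical helper: count reads shorter than a minimum length
# (default 18 nt, matching the threshold in the MultiQC metric).
# Usage: count_short_reads <reads.fastq.gz> [min_length]
count_short_reads() {
    # FASTQ records span 4 lines; the sequence is line 2 of each record.
    gzip -dc "$1" |
        awk -v min_len="${2:-18}" 'NR % 4 == 2 && length($0) < min_len { n++ } END { print n + 0 }'
}
```

If this count is much lower than the number reported in MultiQC, the difference is consistent with additional trimming happening inside miRTrace.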
If the pipeline profile is set to a specific protocol (e.g., illumina, qiaseq, cats, nextflex), the miRTrace module in this pipeline will adjust its processing steps to match the structure of the reads expected for that protocol. If no protocol is specified, miRTrace defaults to the illumina protocol.
If you want miRTrace to handle the trimming in a specific way, set the profile explicitly to one of those available in the pipeline.
However, if you do not want the internal trimming in miRTrace, it is possible to disable it: use protocol 'custom', which sets no protocol option and therefore no adapter.
In the older version (`-r 2.2.1`) of the MIRTRACE_RUN process, the adapter parameter was explicitly passed to miRTrace using the `--adapter` flag. If `--skip_fastp` was not set, then the adapter sequences were obtained from this process. This should be resolved now.
I am working on a test case that uses `--skip_fastp`.
This error is linked to #367. The input channel for miRTrace requires an adapter sequence value that is absent when profile protocol `custom` or `--skip_fastp` is applied.
The adapter sequence is not used in the miRTrace module, so it could be removed.
cc @nschcolnicov
Closed via #383