Illumina_NextSeq_Run dies immediately
Hi Mike,
I am trying to run some Illumina NextSeq samples, but the run fails immediately during primer removal with this error:
Error executing rule remove_5prime_primer on cluster (jobid: 254, external: Submitted batch job 37022315, jobscript: /scratch/sahlab/N812_TruSeqNano/.snakemake/tmp.78wklvoy/snakejob.remove_5prime_primer.254.sh).
java -ea -Xmx15200m -Xms15200m -cp /opt/apps/labs/sahlab/software/miniconda3/envs/hecatomb_v5/snakemake/workflow/conda/75d063ee1aa3bf1dbaf0307c4aa3f5ec/opt/bbmap-38.90-3/current/ jgi.BBDuk in=/scratch/sahlab/N812_TruSeqNano/fastq/WangD_I10209_cell_cutlure_32_14585_E09_UDI0069_GCCACAGGAT_ATGGCATG_S10_R1_001.fastq.gz in2=/scratch/sahlab/N812_TruSeqNano/fastq/WangD_I10209_cell_cutlure_32_14585_E09_UDI0069_GCCACAGGAT_ATGGCATG_S10_R2_001.fastq.gz ref=/opt/apps/labs/sahlab/software/miniconda3/envs/hecatomb_v5/snakemake/workflow/../../databases/contaminants/primerB.fa out=hecatomb_out/PROCESSING/TMP/p01/I10209_cell cutlure_32_14585_R1.s1.out.fastq out2=hecatomb_out/PROCESSING/TMP/p01/I10209_cell cutlure_32_14585_R2.s1.out.fastq stats=hecatomb_out/PROCESSING/STATS/p01/I10209_cell cutlure_32_14585.s1.stats.tsv k=16 hdist=1 mink=11 ktrim=l restrictleft=20 removeifeitherbad=f trimpolya=10 ordered=t rcomp=f ow=t threads=8 -Xmx15200m cutlure_32_14585.log
Executing jgi.BBDuk [in=/scratch/sahlab/N812_TruSeqNano/fastq/WangD_I10209_cell_cutlure_32_14585_E09_UDI0069_GCCACAGGAT_ATGGCATG_S10_R1_001.fastq.gz, in2=/scratch/sahlab/N812_TruSeqNano/fastq/WangD_I10209_cell_cutlure_32_14585_E09_UDI0069_GCCACAGGAT_ATGGCATG_S10_R2_001.fastq.gz, ref=/opt/apps/labs/sahlab/software/miniconda3/envs/hecatomb_v5/snakemake/workflow/../../databases/contaminants/primerB.fa, out=hecatomb_out/PROCESSING/TMP/p01/I10209_cell, cutlure_32_14585_R1.s1.out.fastq, out2=hecatomb_out/PROCESSING/TMP/p01/I10209_cell, cutlure_32_14585_R2.s1.out.fastq, stats=hecatomb_out/PROCESSING/STATS/p01/I10209_cell, cutlure_32_14585.s1.stats.tsv, k=16, hdist=1, mink=11, ktrim=l, restrictleft=20, removeifeitherbad=f, trimpolya=10, ordered=t, rcomp=f, ow=t, threads=8, -Xmx15200m, cutlure_32_14585.log]
Version 38.90
Exception in thread "main" java.lang.RuntimeException: Unknown parameter cutlure_32_14585_R1.s1.out.fastq
at jgi.BBDuk.
Is there an adjustment or flag for NextSeq data?
I found the problem:
When you specify a TSV file, e.g. hecatomb run --reads samples.tsv, Hecatomb expects a 3-column tab-separated file: the first column is the sample name, and the second and third columns are the relative or full paths to the forward and reverse read files. For example:
sample1 /path/to/reads/sample1.1.fastq.gz /path/to/reads/sample1.2.fastq.gz
sample2 /path/to/reads/sample2.1.fastq.gz /path/to/reads/sample2.2.fastq.gz
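I inadvertently had a space in the sample name (e.g. "sample 1" instead of "sample1"). The sample name is used to build the output file names, so the space split each output path into two arguments, and BBDuk rejected the second half as an unknown parameter.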
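To illustrate why the space is fatal, here is a minimal sketch using Python's shlex module to mimic how a shell splits an unquoted command line. The out= argument is copied from the log above; the variable names are only for illustration, and this is not how Hecatomb itself builds the command.

```python
import shlex

# Sample name containing a space, as it appears in the failing command.
sample = "I10209_cell cutlure_32_14585"
out_arg = f"out=hecatomb_out/PROCESSING/TMP/p01/{sample}_R1.s1.out.fastq"

# Unquoted: shell-style splitting breaks the argument at the space,
# so the second half looks like a separate (unknown) parameter.
unquoted = shlex.split(out_arg)

# Quoted: the whole path survives as a single argument.
quoted = shlex.split(shlex.quote(out_arg))

print(unquoted)
print(quoted)
```

The second element of the unquoted result is the same string that BBDuk reports as an unknown parameter.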
Might be worth adding to the docs: "If you specify a TSV file and Hecatomb fails immediately, check your TSV file for spaces."
Actually, I should add a check for spaces in file names when reading from a TSV.
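As a rough idea of what such a check could look like, here is a minimal sketch. It assumes the 3-column layout described above; the function name find_tsv_problems and the message wording are hypothetical and not part of Hecatomb.

```python
def find_tsv_problems(tsv_text):
    """Return a list of (line_number, message) for rows that look wrong.

    Hypothetical helper: checks for exactly 3 tab-separated columns
    and for whitespace inside any field.
    """
    columns = ("sample name", "forward reads path", "reverse reads path")
    problems = []
    for line_no, line in enumerate(tsv_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append(
                (line_no, f"expected 3 tab-separated columns, found {len(fields)}")
            )
            continue
        for name, value in zip(columns, fields):
            if value == "" or any(ch.isspace() for ch in value):
                problems.append(
                    (line_no, f"{name} is empty or contains whitespace: {value!r}")
                )
    return problems


good = "sample1\t/path/to/reads/sample1.1.fastq.gz\t/path/to/reads/sample1.2.fastq.gz\n"
bad = "sample 1\t/path/to/reads/sample1.1.fastq.gz\t/path/to/reads/sample1.2.fastq.gz\n"

for line_no, message in find_tsv_problems(bad):
    print(f"line {line_no}: {message}")
```

A row that uses spaces instead of tabs as the separator would be caught by the column-count check, since it splits into a single field.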