How do I assemble merged paired-end data using SPAdes?

Open roseedwin opened this issue 1 year ago • 6 comments

Description of bug

I have paired-end files from Illumina, e.g. AT11B_trimmed1.fastq and AT11B_trimmed2.fastq. I used BBMerge to merge the two files into AT11B_merged.fastq and AT11B_unmerged.fastq. How exactly do I specify the unmerged file?

I tried using:

spades.py --meta -1 Trial/AT11B_trimmed1.fastq -2 Trial/AT11B_trimmed2.fastq --merged Trial/AT11B_merged.fastq -o Trial/AT11B_spades -k 21,33,55,77 --phred-score 33

The spades.log shows that "Files with merged reads were ignored".

I also tried:

spades.py --meta --merged Trial/AT11B_merged.fastq Trial/AT11B_unmerged.fastq -o Trial/AT11B_out -k 21,33,55,77 --phred-score 33

which also didn't work.

I guess my question is: how exactly do I flag the unmerged reads as mentioned in the manual? ("Non-empty files with (remaining) unmerged left/right reads (separate or interlaced) must be provided for the same library for SPAdes to correctly detect the original read length.")

spades.log

params.txt

SPAdes version

v3.15.3

Operating System

Linux

Python Version

No response

Method of SPAdes installation

conda

No errors reported in spades.log

  • [X] Yes

roseedwin avatar May 10 '23 15:05 roseedwin

Specify merged reads with --merged, unmerged reads with -s, and the original unmerged R1 and R2 reads with -1 and -2.

lmolokin avatar May 12 '23 17:05 lmolokin

Thank you so much for this! Really appreciate the help!!

But I do have a further question, if that's okay. I just realized that I have two unmerged read files, AT11B_unmerged1.fastq and AT11B_unmerged2.fastq. How should they be flagged?

roseedwin avatar May 14 '23 19:05 roseedwin

You would first interleave them into a single fastq file using tools such as seqtk, bbtools, etc.
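For illustration, interleaving can also be sketched with plain awk when neither tool is at hand. This is a minimal sketch, not what seqtk or BBTools do internally: the function name `interleave_fastq` is made up for this example, and it assumes both files contain the same reads in the same order as unwrapped 4-line FASTQ records.

```shell
# Hypothetical helper: interleave two paired FASTQ files record by record.
# Assumes 4-line records and identical read order in both files.
interleave_fastq() {
  awk -v r2="$2" '
    {
      print                      # emit the current R1 line
      if (NR % 4 == 0) {         # a full R1 record has been printed
        for (i = 0; i < 4; i++)  # follow it with the matching R2 record
          if ((getline line < r2) > 0) print line
      }
    }' "$1"
}

# Example call with the file names from this thread:
# interleave_fastq AT11B_unmerged1.fastq AT11B_unmerged2.fastq > AT11B_unmerged_interleaved.fastq
```

If the two files do not have the same number of records, the output will not be correctly paired, so it is worth checking the line counts first.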

lmolokin avatar May 15 '23 02:05 lmolokin

After re-reading the documentation and a few related threads (#751, #765, #891), the correct way actually appears to be:

--merged merged.fastq
-1 R1_remaining_unmerged.fastq
-2 R2_remaining_unmerged.fastq

Issue 751 initially threw me off but it looks like more recent discussions agree with the lines above.
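Put together with the options from the original post, the full call might look like the sketch below. The helper name `spades_merged_cmd` is made up for this example and only prints the command so it can be reviewed before running; the `Trial/` prefix on the unmerged files is an assumption based on the paths used earlier in this thread.

```shell
# Hypothetical helper: print a SPAdes call that passes the merged reads with
# --merged and the remaining unmerged R1/R2 reads with -1/-2.
# Arguments: merged.fastq unmerged_R1.fastq unmerged_R2.fastq output_dir
spades_merged_cmd() {
  printf 'spades.py --meta --merged %s -1 %s -2 %s -o %s -k 21,33,55,77\n' \
    "$1" "$2" "$3" "$4"
}

# Dry run with the file names from this thread (paths are assumptions):
spades_merged_cmd Trial/AT11B_merged.fastq \
  Trial/AT11B_unmerged1.fastq Trial/AT11B_unmerged2.fastq \
  Trial/AT11B_spades
```

Once the printed command looks right, it can be pasted into the shell as is. Note that the -1/-2 files here are the reads left over after merging, not the original trimmed files.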

lmolokin avatar May 22 '23 19:05 lmolokin

Hi,

Thank you. However, I already tried this. If you look at my first query and the log file, you can see that I had provided the --merged, -1 and -2 flags initially, and SPAdes ended up ignoring the merged reads, as mentioned in the log file.

roseedwin avatar May 23 '23 08:05 roseedwin

That message only applies to the read length estimation step, which is also evident from the log. One of the other referenced discussions asks the same question, and the author confirms that merged reads are in fact being used.
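One way to see where the message sits is to list every log line that mentions merged reads together with its line number. This is only a sketch: the helper name `show_merged_log_lines` is made up, and the log path in the usage line is assumed from the -o directory in the original post.

```shell
# Hypothetical helper: print all lines of a SPAdes log that mention merged
# reads (case-insensitive), prefixed with their line numbers.
show_merged_log_lines() {
  grep -n -i 'merged' "$1"
}

# Example call (path assumed from the original post):
# show_merged_log_lines Trial/AT11B_spades/spades.log
```

Opening the log at the reported line numbers shows which stage printed "Files with merged reads were ignored".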

lmolokin avatar May 23 '23 10:05 lmolokin