spades correction results in ill-formatted reads
Hello, I'm trying to assemble some metagenomes downloaded from EBI, and I'm running into an issue where SPAdes read correction outputs FASTQ reads whose quality line is not the same length as the sequence line. This leads to SPAdes failing with the following error:
0:20:05.202 792M / 792M ERROR General (paired_readers.hpp : 56) The number of right read-pairs is larger than the number of left read-pairs
0:20:05.202 792M / 792M ERROR General (paired_readers.hpp : 60) Unequal number of read-pairs detected in the following files: /import/c1/NANOBASE/recollins/metta/assembly/spades-scratch/spades_BMI-AADIOSF-3-C7C8WACXX-IND34-clean_2020-03-04-20-07-40/corrected/trimmed_BMI-AADIOSF-3-C7C8WACXX-IND34-clean_S000_L001_R1_001.fastq.00.0_0.cor.fastq.gz /import/c1/NANOBASE/recollins/metta/assembly/spades-scratch/spades_BMI-AADIOSF-3-C7C8WACXX-IND34-clean_2020-03-04-20-07-40/corrected/trimmed_BMI-AADIOSF-3-C7C8WACXX-IND34-clean_S000_L001_R2_001.fastq.00.0_0.cor.fastq.gz
== Error == system call for: "['/home/recollins/apps/SPAdes-3.13.0-Linux/bin/spades-core', '/import/c1/NANOBASE/recollins/metta/assembly/spades-scratch/spades_BMI-AADIOSF-3-C7C8WACXX-IND34-clean_2020-03-04-20-07-40/K21/configs/config.info']" finished abnormally, err code: 255
reads: ftp.sra.ebi.ac.uk/vol1/run/ERR358/ERR3589564/BMI_AADIOSF_3_1_C7C8WACXX.IND34_clean.fastq.gz ftp.sra.ebi.ac.uk/vol1/run/ERR358/ERR3589564/BMI_AADIOSF_3_2_C7C8WACXX.IND34_clean.fastq.gz
raw FASTQ read:
@H4:C7C8WACXX:3:2207:3174:68099/1
AAAAAAAAATCTAAACGCTAATGCTGAAAAAGNATCACTATTATCTATTATTGGTTTTGTGGTAACAAACGCCGATGACCACAAGATAATAAAAATAAATG
+
@@@DF@FFFHAHHJIHIHAFGIIJICAGCHGG#-7BFGB@GGIIIEIBEEEHHH?;@;@.>A;@CDCD@A?/=9@>B@CCCA1<BCCCDCACC(:<CCDEC
BBDuk-filtered read:
@H4:C7C8WACXX:3:2207:3174:68099/1
AAAAAAAAATCTAAACGCTAATGCTGAAAAAGNATCACTATTATCTATTATTGGTTTTGTGGTAACAAACGCCGATGACCACAAGATAATAAAAATAAATG
+
@@@DF@FFFHAHHJIHIHAFGIIJICAGCHGG!-7BFGB@GGIIIEIBEEEHHH?;@;@.>A;@CDCD@A?/=9@>B@CCCA1<BCCCDCACC(:<CCDEC
SPAdes-3.13.0-Linux corrected read:
@H4:C7C8WACXX:3:2207:3174:68099/1 BH:changed:3
AAAAAAAAATCTAAACGCTAATGCTGAAAAAGGATCACTATTATCTATTATTGGTTTTGTTGTAACAAAAGCCGATGACCACAAGATAATAAAAATAAATG
+H4:C7C8WACXX:3:2207:3174:68099/1 BH:changed:3
@@@DDDDDDDDDDDDCDD
I should mention that this is not an end-of-file issue: the total number of reads, counted with wc -l, is equal for both files.
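Since wc -l only counts lines, it cannot catch a record like the corrected read above, where all four lines are present but the quality string is shorter than the sequence. Here is a minimal sketch of a per-record check. The function names are my own (not part of SPAdes), and it assumes strict four-line FASTQ records with no wrapped sequences:

```python
import gzip
from itertools import zip_longest


def open_fastq(path):
    """Open a FASTQ file as text, transparently handling .gz files."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def find_length_mismatches(path):
    """Return (record_number, header, seq_len, qual_len) for bad records.

    Assumption: every record is exactly four lines
    (header, sequence, '+' line, quality).
    """
    bad = []
    with open_fastq(path) as handle:
        # Reusing the same iterator four times groups the stream into
        # four-line chunks; a short final chunk is padded with "".
        chunks = zip_longest(*[handle] * 4, fillvalue="")
        for number, (header, seq, _plus, qual) in enumerate(chunks, start=1):
            seq = seq.rstrip("\n")
            qual = qual.rstrip("\n")
            if len(seq) != len(qual):
                bad.append((number, header.rstrip("\n"), len(seq), len(qual)))
    return bad
```

Calling find_length_mismatches() on each of the two .cor.fastq.gz files should list every record whose sequence and quality lengths disagree, which would show whether the problem is limited to a few reads or spread through the file.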
Hello
Would it be possible to upload your spades.log file?
I'm running it now with 3.14.0 to see if anything changes.
Looks like one of the files got truncated, probably during the gzip compression – the number of reads written by BayesHammer and the number of reads received by SPAdes differ.
I ran wc -l on the SPAdes-corrected files and got the same number.
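To test the truncation theory directly, one option is to decompress each corrected file to the end and see whether the gzip stream closes cleanly. This is only a sketch: the function names are hypothetical, and it relies on Python's gzip module raising EOFError when a stream ends before its end-of-stream marker.

```python
import gzip


def count_fastq_lines(path):
    """Decompress a .gz file fully and return (line_count, intact).

    intact is False when the gzip stream ends early, which is what a
    truncated file looks like. In that case line_count only covers the
    part that could be read before the error.
    """
    lines = 0
    try:
        with gzip.open(path, "rb") as handle:
            for _ in handle:
                lines += 1
    except EOFError:
        return lines, False
    return lines, True


def pair_is_consistent(left_path, right_path):
    """True if both files are intact and hold the same whole number of reads.

    Assumption: four lines per FASTQ record.
    """
    left_lines, left_ok = count_fastq_lines(left_path)
    right_lines, right_ok = count_fastq_lines(right_path)
    return (
        left_ok
        and right_ok
        and left_lines == right_lines
        and left_lines % 4 == 0
    )
```

Running pair_is_consistent() on the R1 and R2 .cor.fastq.gz paths from the error message would show whether either file is cut short. On the command line, gzip -t performs a similar integrity test.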