Paired end fastq queries exit with "Error: Unequal number of sequences in paired read files."

Open ryandkuster opened this issue 2 years ago • 2 comments

When running blastx using paired end fastq-formatted reads as query input:

diamond blastx -d reference.dmnd -o results.tsv -q R1.fastq R2.fastq

...the job fails with the message:

Error: Unequal number of sequences in paired read files.

This error occurs with any dataset, regardless of compression. Even a pair of test files containing a single R1 read and a single R2 read produces it. I have reproduced it with the conda build of 2.1.8 and with the 2.1.8 and 2.1.0 binaries. The 2.0.15 binary works with exactly the same usage, so the problem was likely introduced somewhere between 2.0.15 and 2.1.0.
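
To rule out a genuine mismatch before blaming the tool, the record counts of the two files can be compared directly. This is a minimal sketch, assuming uncompressed FASTQ with exactly four lines per record; the function names `count_fastq_records` and `check_pairs` are made up for illustration and are not part of diamond.

```shell
# Count records in an uncompressed FASTQ file.
# Assumption: every record is exactly 4 lines (no wrapped sequences).
count_fastq_records() {
    awk 'END { print int(NR / 4) }' "$1"
}

# Print both counts; succeed (exit status 0) only when they match.
check_pairs() {
    n1=$(count_fastq_records "$1")
    n2=$(count_fastq_records "$2")
    echo "$1: $n1 records, $2: $n2 records"
    [ "$n1" -eq "$n2" ]
}

# Usage: check_pairs R1.fastq R2.fastq
```

If the counts match and the error still appears, the input files themselves are probably not the cause. For gzip-compressed input, the files would need to be decompressed (for example through `gzip -dc`) before counting.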

I've seen workarounds that merge the paired-end reads first, but that is not always ideal when the reads don't overlap well. Thank you!

ryandkuster avatar Sep 07 '23 13:09 ryandkuster

Ok, looks like this needs to be fixed. But note that passing 2 files like this is no different from aligning each file separately; the pairing information is not really used.
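
Until a fix lands, running each file on its own might serve as a stopgap. The following is only a sketch: it reuses the `blastx`, `-d`, `-q` and `-o` options from the command in this issue, while the function name `align_separately` and the one-table-per-input naming are assumptions chosen for illustration.

```shell
# Align each read file on its own, writing one output table per input.
# The diamond options are the ones shown in the original command;
# the function name and the output naming scheme are hypothetical.
align_separately() {
    db=$1
    shift
    for reads in "$@"; do
        # R1.fastq -> R1.tsv (only a trailing ".fastq" is stripped)
        out="$(basename "$reads" .fastq).tsv"
        diamond blastx -d "$db" -q "$reads" -o "$out" || return 1
    done
}

# Usage: align_separately reference.dmnd R1.fastq R2.fastq
# This would write R1.tsv and R2.tsv to the current directory.
```

Keeping the two result tables separate makes it easy to tell which mate each hit came from; they can be concatenated afterwards if a single table is preferred.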

bbuchfink avatar Sep 14 '23 11:09 bbuchfink

Hello guys,

Same issue here, even though my fastq files are properly paired. I'm on version 2.1.9.163.

Thank you

deyvidamgarten avatar Mar 07 '24 15:03 deyvidamgarten