Support interleaved paired-end reads [feature request]

Open sjackman opened this issue 8 years ago • 6 comments

Hi, Ryan. Please consider supporting assembly of interleaved paired-end reads from a single FASTQ file. Thanks!

sjackman avatar Aug 18 '17 02:08 sjackman

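As a workaround, split the interleaved file with seqtk and pass the two halves to Unicycler:

seqtk seq -1 foo.fq.gz | pigz -p64 >foo.1.fq.gz
seqtk seq -2 foo.fq.gz | pigz -p64 >foo.2.fq.gz
unicycler -t64 -o foo.unicycler -1 foo.1.fq.gz -2 foo.2.fq.gz

The same steps as Makefile rules, where t is the number of threads:

# Separate the first read from an interleaved FASTQ file.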
%.1.fq.gz: %.fq.gz
	seqtk seq -1 $< | pigz -p$t >$@

# Separate the second read from an interleaved FASTQ file.
%.2.fq.gz: %.fq.gz
	seqtk seq -2 $< | pigz -p$t >$@

# Assemble reads using Unicycler.
%.unicycler/assembly.fasta: %.1.fq.gz %.2.fq.gz
	unicycler -t$t -o $(@D) -1 $*.1.fq.gz -2 $*.2.fq.gz

sjackman avatar Aug 18 '17 02:08 sjackman

Thanks - I'll add this to my future feature list. In the meantime, I appreciate the workaround!

rrwick avatar Aug 18 '17 11:08 rrwick

If you do, please consider implementing the same smart pairing feature as bwa mem -p, which allows for unpaired reads in the same interleaved paired-end file.

❯❯❯ bwa mem |& grep -- -p
       -p            smart pairing (ignoring in2.fq)
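
To make the idea concrete, here is a rough sketch of smart pairing as a shell function. It is only an illustration, not how bwa or Unicycler implement anything: adjacent records with the same read name are treated as a pair, and anything else is written out as unpaired. The function name `split_interleaved`, the output suffixes, and the rule of ignoring a trailing `/1` or `/2` and any header comment when comparing names are all assumptions made for this example. It expects uncompressed four-line FASTQ records on stdin.

```shell
# Hypothetical helper: split an interleaved FASTQ stream on stdin into
# PREFIX.1.fq, PREFIX.2.fq (pairs) and PREFIX.s.fq (unpaired reads).
split_interleaved() {
    prefix=$1
    # Create or empty all three outputs so each exists even if unused.
    : > "$prefix.1.fq"
    : > "$prefix.2.fq"
    : > "$prefix.s.fq"
    # paste joins each 4-line FASTQ record into one tab-separated line.
    paste - - - - | awk -F'\t' -v p="$prefix" '
    # Read name without the header comment or a trailing /1 or /2 (assumed rule).
    function key(h) { sub(/ .*/, "", h); sub(/\/[12]$/, "", h); return h }
    # Turn the one-line record back into four lines and write it to file.
    function emit(rec, file) { gsub(/\t/, "\n", rec); print rec > file }
    {
        k = key($1)
        if (held != "" && k == heldkey) {
            # Same name as the previous record: the two form a pair.
            emit(held, p ".1.fq"); emit($0, p ".2.fq"); held = ""
        } else {
            # Previous record had no mate: it is unpaired.
            if (held != "") emit(held, p ".s.fq")
            held = $0; heldkey = k
        }
    }
    END { if (held != "") emit(held, p ".s.fq") }'
}
```

With compressed input this could be run as `gzip -dc foo.fq.gz | split_interleaved foo`. The three files could then be passed to Unicycler with `-1`, `-2`, and, assuming its unpaired short-read option `-s` is what you want for the leftover reads, `-s foo.s.fq`.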

sjackman avatar Aug 18 '17 17:08 sjackman

For the command-line interface, I'd suggest…

unicycler -o output [file…]

where [file…] is one or more interleaved FASTA/FASTQ files, possibly compressed.

sjackman avatar Aug 18 '17 17:08 sjackman

I'd still love to see support for interleaved FASTQ files for unicycler and unicycler_polish.

sjackman avatar Oct 01 '18 16:10 sjackman

Hi Ryan,

Just wondering whether there is any plan to implement this soon, or whether it has already been implemented but is undocumented?

Cheers

rhysnewell avatar Apr 18 '21 22:04 rhysnewell