flowcraft
Handle single fastq files as inputs
Right now, assemblerflow accepts only paired-end read files (FastQ). It would be handy to add support for single FastQ files.
I agree that this would be a good thing to support. From the point of view of assemblerflow, the required modification is trivial. The fromFilePairs channel could be defined as:
Channel.fromFilePairs(params.reads, size: params.singleEnd ? 1 : 2, type: 'file')
This would allow for both single and paired-end data. However, most of the modifications would be in the template scripts themselves, which are mostly designed for paired-end data. Moreover, we would need to require that templates using FastQ data support both single and paired-end files whenever possible.
When that isn't possible, it should be stated explicitly in the documentation of the component using the template (and perhaps pipelines using those components should include a check that prevents the input of single-end data).
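The kind of check mentioned above could look roughly like the following. This is only a sketch in Python: the function name `check_fastq_input` and the `supports_single_end` flag are hypothetical and not part of assemblerflow.

```python
def check_fastq_input(fastq_files, supports_single_end):
    """Return "paired" or "single" for a list of FastQ paths.

    Hypothetical helper: raises ValueError when the component does not
    support the layout it was given.
    """
    n_files = len(fastq_files)
    if n_files == 2:
        return "paired"
    if n_files == 1:
        if not supports_single_end:
            # Fail early instead of letting a paired-end-only tool crash later.
            raise ValueError(
                "This component requires paired-end FastQ files, "
                "but a single file was provided: {}".format(fastq_files[0])
            )
        return "single"
    raise ValueError("Expected 1 or 2 FastQ files, got {}".format(n_files))
```

A template that cannot handle single-end data would call it with `supports_single_end=False`, so the failure is explicit and happens before any tool is run.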
Also, a statement like this can be used in each process template that needs to handle both paired-end and single-end inputs.
This will be highly dependent on how the software handles paired-end data and on whether you use subprocess in Python or call bash directly.
On Sat, 14 Apr 2018, 20:05 Tiago Jesus, [email protected] wrote:
Also, a statement like this https://github.com/ODiogoSilva/assemblerflow/blob/patlas/assemblerflow/generator/templates/mapping_patlas.nf#L21-L26 can be used in each process template that needs to handle both paired-end and single-end inputs.
Yes, my point is precisely that we can leave that handling to the process itself or to the Python script, as you say.
I've been receiving requests to support single FastQ files as input. @tiagofilipe12's example is no longer available, but I like @ODiogoSilva's solution of having both possibilities in each template. A simple solution is to duplicate the scripts and adjust accordingly.
Double the scripts, double the maintenance effort. It seems easier to have the FastQ channel accept both paired and single reads and then add a condition in the templates depending on the number of FastQ files received.
I agree. I was just thinking about tools where you have to pass each FastQ file in a different parameter, or that have completely different parameters depending on whether you're working with paired-end or single FastQ files.
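As an illustration of a tool that takes each file in a different parameter, Bowtie 2 uses `-1`/`-2` for paired-end mates and `-U` for unpaired reads. A minimal sketch of a single template branching on the number of FastQ files received is shown below. The function name `build_bowtie2_cmd` and its arguments are hypothetical, and it only builds the argument list without running anything.

```python
def build_bowtie2_cmd(index, fastq_files, threads=1):
    """Build a bowtie2 argument list for single or paired-end input.

    Hypothetical helper: `index` is the Bowtie 2 index basename and
    `fastq_files` is a list with one or two FastQ paths.
    """
    cmd = ["bowtie2", "-x", index, "-p", str(threads)]
    if len(fastq_files) == 2:
        # Paired-end: each mate file goes in its own option.
        cmd += ["-1", fastq_files[0], "-2", fastq_files[1]]
    elif len(fastq_files) == 1:
        # Single-end: unpaired reads use a different option altogether.
        cmd += ["-U", fastq_files[0]]
    else:
        raise ValueError(
            "Expected 1 or 2 FastQ files, got {}".format(len(fastq_files))
        )
    return cmd
```

The resulting list could then be passed to `subprocess.run(cmd, check=True)` from a Python template, which keeps both cases in one script instead of duplicating it.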
Does Flowcraft support interleaved paired-end FASTQ files, for example as output by samtools mergepe? That's my preferred format, by far, for paired-end reads.
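For context, an interleaved file stores each read pair as two consecutive four-line records, so one way of supporting it would be to split it back into two mate streams before the existing paired-end templates run. The sketch below assumes strictly four-line records (no wrapped sequence lines). The function name `split_interleaved` is hypothetical and is not something Flowcraft provides.

```python
def split_interleaved(lines):
    """Split interleaved FASTQ lines into (mate1_lines, mate2_lines).

    Assumes every record is exactly four lines and that mates alternate:
    record 1 is mate 1, record 2 is mate 2, and so on.
    """
    if len(lines) % 8 != 0:
        raise ValueError(
            "Interleaved FASTQ must contain complete pairs of 4-line records"
        )
    mate1, mate2 = [], []
    for start in range(0, len(lines), 8):
        mate1.extend(lines[start:start + 4])      # first record of the pair
        mate2.extend(lines[start + 4:start + 8])  # second record of the pair
    return mate1, mate2
```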
I opened issue https://github.com/assemblerflow/flowcraft/issues/136 for this feature request.