Add demultiplexing step

Open DiegoBrambilla opened this issue 6 years ago • 12 comments

Hi, a very helpful feature to add would be demultiplexing of the reads as an optional step. This function has already been developed in QIIME2 and, as such, it should be possible to add it to the rrna-ampliseq pipeline.

DiegoBrambilla avatar Jan 18 '19 12:01 DiegoBrambilla

This might be a helpful feature. As far as I know, work is ongoing to wrap DADA2 directly in this pipeline instead of running DADA2 through QIIME2. Therefore I am unsure how to integrate this feature sustainably, given the major changes that are planned for the early workflow. However, PRs are welcome.

d4straub avatar Jul 13 '19 19:07 d4straub

@DiegoBrambilla is planning to implement dada2 for PacBio analysis and could immediately add that demultiplexing step :)

d4straub avatar Aug 10 '19 08:08 d4straub

We will take it into consideration. For the time being, implementing the R-DADA2 pipeline, annotating taxonomy from several sources, and dealing with PacBio reads take priority.

DiegoBrambilla avatar Aug 26 '19 13:08 DiegoBrambilla

Demultiplexing could be done via cutadapt as documented here. I have never come across the need for demultiplexing in the pipeline, but if anyone does, please mention it here and I might look into it further.

d4straub avatar Jan 03 '22 08:01 d4straub

I want to add demultiplexing (with Cutadapt) to Ampliseq. The way I've handled demultiplexing in my own nf-core style pipeline is to ask the user to specify the path to their raw data in the command line --raw_data "/path/to/data/*{R1,R2}*.fastq.gz". Then in the sample sheet the user has to add the columns fw_index, rv_index, fw_primer, and rv_primer (the two rv_ columns can be empty for single-end data). I use the _index columns for demultiplexing and the _primer columns for trimming after demultiplexing. The main issue I see is that Ampliseq doesn't require a sample sheet as input, so I'm wondering if anyone has a suggestion for a better way of adding this feature to Ampliseq? Maybe the sample sheet should be required if the user wants to demultiplex?
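
A minimal sketch of how such a sample sheet could be turned into barcode FASTA files for Cutadapt. It is not taken from any existing pipeline. The `sampleID` column name, the index and primer values, and the helper names `read_sheet` and `barcode_fasta` are assumptions made for illustration; only `fw_index`, `rv_index`, `fw_primer` and `rv_primer` come from the description above. The `^` prefix assumes the index sits at the very 5' end of the read.

```python
import csv
import io

# Hypothetical tab-separated sample sheet; all values are made up for illustration.
SHEET = (
    "sampleID\tfw_index\trv_index\tfw_primer\trv_primer\n"
    "s1\tACGTAC\tTTGCAA\tGTGYCAGCMGCCGCGGTAA\tGGACTACNVGGGTWTCTAAT\n"
    "s2\tGATCGA\tTTGCAA\tGTGYCAGCMGCCGCGGTAA\tGGACTACNVGGGTWTCTAAT\n"
)


def read_sheet(handle):
    """Read a tab-separated sample sheet into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def barcode_fasta(rows, column):
    """Build FASTA text with one anchored record per distinct index in `column`.

    Each record is named after its own sequence, so demultiplexed files can
    later be mapped back to sample IDs via the sample sheet.
    """
    seen = []
    for row in rows:
        seq = (row.get(column) or "").strip().upper()
        if seq and seq not in seen:  # skip empty cells (single-end data) and repeats
            seen.append(seq)
    return "".join(f">{seq}\n^{seq}\n" for seq in seen)


rows = read_sheet(io.StringIO(SHEET))
fw_fasta = barcode_fasta(rows, "fw_index")
rv_fasta = barcode_fasta(rows, "rv_index")
print(fw_fasta, end="")
print(rv_fasta, end="")
```

The two FASTA strings would be written to files and passed to Cutadapt; the `_primer` columns are left untouched here because primer trimming happens after demultiplexing.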

a4000 avatar Aug 02 '23 03:08 a4000

What about adding a few optional columns (such as fw_index, rv_index) to the sample sheet? If those columns are present, demultiplexing will run. If that might mess too much with existing routines, a separate input file (e.g. --demultiplex "sheet.tsv") that contains the necessary information (samplesheet & demultiplexsheet have identical IDs) might be an option. While ampliseq does not require a samplesheet (folder input & fasta input are also allowed), requiring one for demultiplexing would be fine. After all, a samplesheet can handle more info than a folder input. Not all input options need to support all functionality, imho.
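
A rough sketch of the "run only if the columns are present" idea, assuming the check is made on the sample sheet header. The function name `demultiplexing_mode` and the return values `"dual"` and `"single"` are hypothetical and not part of ampliseq.

```python
def demultiplexing_mode(fieldnames):
    """Decide from the sample sheet header whether demultiplexing should run.

    Returns "dual" if both index columns exist, "single" if only fw_index
    exists, and None if neither exists (demultiplexing is skipped).
    """
    has_fw = "fw_index" in fieldnames
    has_rv = "rv_index" in fieldnames
    if has_fw and has_rv:
        return "dual"
    if has_fw:
        return "single"
    if has_rv:
        # A reverse index without a forward index is treated as a user error here.
        raise ValueError("rv_index column found without fw_index")
    return None


print(demultiplexing_mode(["sampleID", "forwardReads", "reverseReads"]))
print(demultiplexing_mode(["sampleID", "forwardReads", "fw_index", "rv_index"]))
```

Existing sample sheets without the new columns would pass through unchanged, which is the main appeal of optional columns over a separate input file.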

d4straub avatar Aug 02 '23 06:08 d4straub

To me, adding columns to the sample sheet sounds best.

erikrikarddaniel avatar Aug 02 '23 07:08 erikrikarddaniel

Hi there,

I’m curious if it’s now possible to utilize AmpliSeq with the combinatorial dual indexing system and perform demultiplexing directly in the pipeline as part of the AmpliSeq workflow. Could someone please clarify? Thanks!

NoMeatNo avatar Apr 29 '24 22:04 NoMeatNo

@NoMeatNo unfortunately no. That's not a part of Ampliseq yet.

a4000 avatar Apr 29 '24 23:04 a4000

Oh, I see. Thanks @a4000 for the quick response.

In the meantime, what’s the best strategy to follow? Would using Cutadapt and then Ampliseq be effective? How about q2-demux?

Earlier, you mentioned a method for demultiplexing in your own nf-core style pipeline, which involved specifying the path to raw data and using specific columns in the sample sheet. Could you provide more details on how you managed it? I’d appreciate any additional information you can share.

NoMeatNo avatar Apr 29 '24 23:04 NoMeatNo

@NoMeatNo I haven't tried q2-demux, but I do recommend following Cutadapt's documentation on demultiplexing here. Using Cutadapt then Ampliseq should be effective.
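
For combinatorial dual indexes, the Cutadapt call could be assembled roughly as below. This is a sketch under assumptions, not a tested recipe: the file names, the output directory and the helper name `cutadapt_demux_command` are hypothetical, the error-rate settings are illustrative, and the barcode FASTA files are assumed to hold 5'-anchored sequences (each written as `^SEQ`). The `{name1}` and `{name2}` placeholders are filled in by Cutadapt with the matching forward and reverse barcode names.

```python
import shlex


def cutadapt_demux_command(fw_fasta, rv_fasta, r1, r2, outdir="demux"):
    """Build an argument list for paired-end demultiplexing with Cutadapt."""
    return [
        "cutadapt",
        "-e", "0.15", "--no-indels",   # illustrative settings: allow mismatches, no indels
        "-g", f"file:{fw_fasta}",      # forward barcodes, searched in R1
        "-G", f"file:{rv_fasta}",      # reverse barcodes, searched in R2
        "-o", f"{outdir}/{{name1}}-{{name2}}.R1.fastq.gz",
        "-p", f"{outdir}/{{name1}}-{{name2}}.R2.fastq.gz",
        r1, r2,
    ]


cmd = cutadapt_demux_command(
    "barcodes_fwd.fasta", "barcodes_rev.fasta",
    "pool_R1.fastq.gz", "pool_R2.fastq.gz",
)
# Print a shell-quoted version for inspection; pass `cmd` to subprocess.run to execute it.
print(shlex.join(cmd))
```

The resulting per-sample FASTQ pairs can then be listed in an Ampliseq sample sheet; the output directory has to exist before Cutadapt runs.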

a4000 avatar Apr 30 '24 00:04 a4000

You could also check out https://nf-co.re/demultiplex (which I have never used), run it first, and then use ampliseq. If you do, let us know if that works as expected. Just don't do primer trimming or any quality filtering!

d4straub avatar Apr 30 '24 06:04 d4straub