Add "{rc}" demultiplexing template variable

Open AvdReis opened this issue 3 years ago • 3 comments

Hi there

Loving this --revcomp function!

I was wondering whether it is possible to be selective about which reads to keep, i.e., reads in which the adapters were found in the 5'-3' orientation versus reads that were reverse-complemented. Is this being considered as a possible addition in the next release?

I am busy working with Nanopore MinION data and need to keep both, but would still like to keep them separate.

Cheers, Aimee

AvdReis • Apr 21 '21 05:04

Do I understand correctly that you would like to get two output files, one with the reads that were and one with the reads that were not reverse-complemented? This is not directly supported when you use --revcomp, but you could manually provide your adapter(s) in both the forward and the reverse-complemented version and then use demultiplexing to get multiple output files. So something like this:

cutadapt -a fwd=ADAPTER -a revcomp=REVCOMP_OF_ADAPTER -o "trimmed.{name}.fastq.gz" input.fastq.gz

Then you’d get trimmed.fwd.fastq.gz and trimmed.revcomp.fastq.gz. Note that the --revcomp option should not be provided in this case.
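
Since the second adapter has to be supplied already reverse-complemented, a small shell helper can compute it. This is only a sketch: `revcomp` is a hypothetical helper name, and the adapter sequence is an illustrative placeholder, not one taken from this thread. It handles plain A/C/G/T (N is left as-is) and does not translate IUPAC ambiguity codes.

```shell
# Hypothetical helper: print the reverse complement of a DNA sequence.
# Complements A, C, G, T (upper and lower case); other characters pass through.
revcomp() {
  printf '%s\n' "$1" |
    awk '{ out = ""; for (i = length($0); i >= 1; i--) out = out substr($0, i, 1); print out }' |
    tr 'ACGTacgt' 'TGCAtgca'
}

ADAPTER=AGATCGGAAGAGC                      # placeholder sequence for illustration
REVCOMP_OF_ADAPTER=$(revcomp "$ADAPTER")
echo "$REVCOMP_OF_ADAPTER"                 # prints GCTCTTCCGATCT

# The two values could then be passed to the command above, e.g.:
# cutadapt -a fwd="$ADAPTER" -a revcomp="$REVCOMP_OF_ADAPTER" -o "trimmed.{name}.fastq.gz" input.fastq.gz
```

The cutadapt call is left commented out so the snippet runs without input files.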

marcelm • Apr 21 '21 11:04

Thanks for this. Yes, I was hoping to get two files out. Instead, I ended up taking the untrimmed sequences, reverse-complementing them, and then searching for the primers in the forward direction, so that the sequences would be in the 5'-3' direction once trimmed.
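
For anyone wanting to reproduce this workaround, the reverse-complementing step might look roughly like the sketch below. It is a guess at the workflow, not the exact commands used: `fastq_revcomp` is a hypothetical helper, `PRIMER` and the file names are placeholders, and the input is assumed to be uncompressed FASTQ with exactly four lines per record.

```shell
# Hypothetical helper: reverse-complement every record of a 4-line FASTQ on stdin.
# The sequence is reverse-complemented and the quality string is reversed to match.
fastq_revcomp() {
  awk '
    BEGIN {
      n = split("A C G T a c g t", src, " ")
      split("T G C A t g c a", dst, " ")
      for (k = 1; k <= n; k++) comp[src[k]] = dst[k]
    }
    function reverse(s,    i, out) {
      out = ""
      for (i = length(s); i >= 1; i--) out = out substr(s, i, 1)
      return out
    }
    function revcomp(s,    i, c, out) {
      out = ""
      for (i = length(s); i >= 1; i--) {
        c = substr(s, i, 1)
        out = out ((c in comp) ? comp[c] : c)   # unknown bases (e.g. N) are kept
      }
      return out
    }
    NR % 4 == 2 { print revcomp($0); next }     # sequence line
    NR % 4 == 0 { print reverse($0); next }     # quality line
    { print }                                   # header and "+" lines
  '
}

# Possible workflow (placeholders, left commented out):
# cutadapt -g PRIMER --untrimmed-output untrimmed.fastq -o trimmed.forward.fastq input.fastq
# fastq_revcomp < untrimmed.fastq > untrimmed.rc.fastq
# cutadapt -g PRIMER -o trimmed.reverse.fastq untrimmed.rc.fastq
```

Keeping the two cutadapt outputs in separate files preserves the forward/reverse distinction while leaving every trimmed read in the same orientation.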

AvdReis • Apr 28 '21 12:04

Ah right, of course the command I gave doesn’t do the normalization that --revcomp would do.

No promises that I will have time to do this anytime soon, so this is more of a note to myself: perhaps a command like this could be made to work:

cutadapt --revcomp -a ADAPTER -o "trimmed.{rc}.fastq.gz" input.fastq.gz

To determine to which file a trimmed read needs to be written, the {rc} would be replaced with something like forward/reverse depending on whether the read was reverse-complemented or not. So you would end up with trimmed.forward.fastq.gz and trimmed.reverse.fastq.gz, and the reads would still be normalized because --revcomp was used.
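
Until something like `{rc}` exists for output file names, a similar split can be approximated after the fact. The sketch below relies on an assumption that should be checked against the documentation of the installed version: that with --revcomp, cutadapt marks reverse-complemented reads by appending ` rc` to the read header. `split_by_rc` is a hypothetical helper, and it expects uncompressed four-line FASTQ on stdin.

```shell
# Hypothetical helper: split a 4-line FASTQ stream into two files, depending on
# whether the header line ends in " rc" (assumed marker for reverse-complemented reads).
# $1: output path for forward reads, $2: output path for reverse-complemented reads
split_by_rc() {
  awk -v fwd="$1" -v rev="$2" '
    NR % 4 == 1 { out = ($0 ~ / rc$/) ? rev : fwd }   # choose the file per record
    { print > out }
  '
}

# Possible usage (placeholders, left commented out):
# cutadapt --revcomp -a ADAPTER -o trimmed.fastq input.fastq
# split_by_rc trimmed.forward.fastq trimmed.reverse.fastq < trimmed.fastq
```

A read whose original header happens to end in ` rc` would be misclassified, and an output file is only created if at least one read is written to it.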

marcelm • Apr 29 '21 09:04