
(new) problem with output having reads that don't have a mate in R1 and R2 files.

Open bmillerlab opened this issue 1 year ago • 2 comments

Hello, I used trim-galore last year to clean up a lot of fastq.gz files I received from my service provider. It worked great and I was able to use the files for downstream analysis with no problems. I have just received new fastq.gz files from the same library prep (Nextera) and the same service provider, and repeated the cleanup with the same command:

trim_galore --cores 8 --paired -o galore_output --length 50 --nextera -a2 GTGTAGAGCC -q 25 --fastqc [long list of files names here]

The fastqc.html reports look fine (I actually used MultiQC to look at the aggregate data): the adapters are removed and poor-quality sequence is trimmed. However, this time I am getting a subset of sequences that no longer pair between the R1 and R2 files, which causes my downstream analysis to fail. I used Geneious Prime to test several pairs of files by trying to merge them, and confirmed that all of them contain unpaired reads where trim-galore left different sequences in R1 versus R2. I then tested the same files in the same way before they were "cleaned" (a slow process because they are large and still contain low-quality sequences), and they all pair perfectly, so the problem definitely arose during the processing.
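For anyone wanting to reproduce the pairing check without Geneious, here is a minimal Python sketch. It assumes standard 4-line FASTQ records, that mates should appear in the same order in both files, and that read names match once an optional trailing `/1` or `/2` is removed. The function names are made up for illustration and are not part of Trim Galore.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read ID of each record in a FASTQ file (plain or gzipped).

    Assumes 4-line records; the ID is the header text after '@' up to the
    first whitespace, with a trailing '/1' or '/2' mate suffix removed.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 != 0:
                continue  # only header lines carry the read name
            fields = line.split()
            if not fields:
                continue  # tolerate a blank line at the end of the file
            name = fields[0][1:]  # drop the leading '@'
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]
            yield name


def first_mismatch(r1_path, r2_path):
    """Return (record_index, r1_id, r2_id) for the first record where the
    two files disagree, or None if every record pairs up.

    An ID of None means that file ran out of records first.
    """
    pairs = zip_longest(read_ids(r1_path), read_ids(r2_path))
    for index, (id1, id2) in enumerate(pairs):
        if id1 != id2:
            return index, id1, id2
    return None
```

Calling `first_mismatch("sample_R1_val_1.fq.gz", "sample_R2_val_2.fq.gz")` (placeholder file names) on a trimmed pair and then on the matching raw pair shows whether the files go out of sync, and at which record. The files are streamed, so memory use stays small even for large inputs.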

Could you suggest where the problem may be arising so that I can try to fix the files? I did update to the newest TrimGalore version; should I roll back to an older version if this is a new issue?

Thank you for any suggestions you may have.

bmillerlab • Oct 07 '22 20:10