Poly A trimming doesn't work
Hello,
A previous issue was supposed to address this problem, but it seems to have been closed prematurely. Here it is:
https://github.com/OpenGene/fastp/issues/132
Users, including myself, are still dealing with this problem. Other types of trimming seem to work just fine; only poly A trimming fails.
Please address this. Poly A tails are a major concern for RNA-Seq. Whether this is fixed or not will determine if we continue to use fastp. I'm sure this is true for many groups.
I am also dealing with this issue, but on some new EM-seq data, the libraries for which have been end repaired with dA-tailing (NEBNext). After running FastQC, there are apparently polyA tails on the R2 libraries that cannot be removed using fastp's polyX trimming.
The same issue is likely happening in fastplong. I am dealing with a similar problem using PacBio HiFi reads.
Maybe cutadapt[1] is a working alternative? We use cutadapt for all our RNAseq data (polyA/T, adaptors etc). It is actively developed and the author is responsive ;-)
[1] https://github.com/marcelm/cutadapt/
I have this issue as well. fastp appears to perform poorly on sequences with high rates of polyA and polyT tracts, and cutadapt was the only way I could address this.
Guys, I will take this issue as first priority, and will fix it asap.
Can you show me some polyA FASTQ records that are trimmed correctly by other tools, but not trimmed by fastp?
Can you share some data here? I can hardly reproduce this issue with my own RNA-seq data. I suspect that this issue only happens when the template length is usually less than the sequencing length.
@downingtim @KatharineME @daonslog can you share some polyA reads that cannot be trimmed by fastp?
Once I can reproduce it, I will fix it very soon.
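For anyone who wants to pull out example reads to share, here is a minimal sketch in Python that scans a fastp output FASTQ for reads that still carry a long run of A's. It is not part of fastp; the function names and the threshold of 10 bases are assumptions chosen for illustration. It reports both the run at the very 3' end and the longest run anywhere in the read, since a poly A stretch followed by adapter sequence (template shorter than the read length) would not sit at the end of the read.

```python
import gzip
import re
import sys


def trailing_a_run(seq):
    """Number of consecutive A bases at the 3' end of the read."""
    seq = seq.strip().upper()
    return len(seq) - len(seq.rstrip("A"))


def longest_a_run(seq):
    """Length of the longest run of A bases anywhere in the read."""
    runs = re.findall(r"A+", seq.strip().upper())
    return max((len(r) for r in runs), default=0)


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header:
            return
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield header.rstrip(), seq.rstrip(), plus.rstrip(), qual.rstrip()


def untrimmed_poly_a(handle, min_len=10):
    """Yield records whose longest A run is at least min_len (hypothetical cutoff)."""
    for rec in read_fastq(handle):
        if longest_a_run(rec[1]) >= min_len:
            yield rec


if __name__ == "__main__" and len(sys.argv) > 1:
    path = sys.argv[1]
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as fh:
        for header, seq, plus, qual in untrimmed_poly_a(fh):
            # Print the record unchanged so it can be pasted into the issue.
            print(header, seq, plus, qual, sep="\n")
            print(
                f"# trailing A run: {trailing_a_run(seq)}, "
                f"longest A run: {longest_a_run(seq)}",
                file=sys.stderr,
            )
```

Saved under any name and run with the trimmed FASTQ (plain or gzipped) as its only argument, it writes the matching records to stdout and the run lengths to stderr. A handful of those records, together with the exact fastp command line used, should be enough to reproduce the problem.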