adapter detection gets stuck and uses all memory on ONT data

Open mcolpus opened this issue 6 months ago • 2 comments

Hello,

I'm using fastp in a pipeline and found that some samples fail with out-of-memory errors. I've attached a small example: small.fq.gz

I'm using version 0.24.1 and just run:

fastp -i small.fq.gz -o clean.fastq.gz

It displays:

Detecting adapter sequence for read1...

but then gets stuck while using more and more RAM (100G+).

There are some very short reads, so I tried filtering out any reads shorter than 100 bp first, but that doesn't fix the issue.
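
For anyone reproducing this, a length pre-filter of that kind can be sketched roughly as below. This is a minimal Python sketch and not necessarily the tool that was used for the filtering described here. The function names `read_fastq` and `filter_short_reads` are hypothetical, and the code assumes gzip-compressed input with strictly 4-line FASTQ records.

```python
import gzip
from typing import Iterator, TextIO, Tuple

# One FASTQ record: header, sequence, separator ("+") and quality string.
Record = Tuple[str, str, str, str]


def read_fastq(handle: TextIO) -> Iterator[Record]:
    """Yield records from a stream of strictly 4-line FASTQ entries."""
    while True:
        header = handle.readline()
        if not header:  # end of file
            return
        seq = handle.readline()
        sep = handle.readline()
        qual = handle.readline()
        yield (
            header.rstrip("\n"),
            seq.rstrip("\n"),
            sep.rstrip("\n"),
            qual.rstrip("\n"),
        )


def filter_short_reads(in_path: str, out_path: str, min_len: int = 100) -> Tuple[int, int]:
    """Copy reads with at least min_len bases to out_path.

    Returns a (kept, dropped) pair of read counts.
    """
    kept = dropped = 0
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for header, seq, sep, qual in read_fastq(src):
            if len(seq) >= min_len:
                dst.write(f"{header}\n{seq}\n{sep}\n{qual}\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Calling `filter_short_reads("small.fq.gz", "small.min100.fq.gz")` (the output filename is only an example) would write the reads of 100 bp or longer and return how many reads were kept and dropped.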

If I disable adapter trimming, the sample runs completely fine.

Thanks. Let me know if you need any more info.

mcolpus avatar May 02 '25 13:05 mcolpus

I've been running into some memory issues as well (albeit on short-read sequencing data) and took a quick look at the adapter trimming logic using your test file. I see the same spike in memory usage, and it looks like it's occurring in the Evaluator::getAdapterWithSeed function: some FASTQ entries produce dozens or hundreds of candidate adapter hits, all of which are stored and used to determine the actual adapter sequence.

ckrushton avatar May 05 '25 10:05 ckrushton

I will fix this soon

sfchen avatar May 30 '25 22:05 sfchen