
Set BWA and bwamem2 index memory dynamically

Open · edmundmiller opened this issue 1 year ago · 10 comments

Kept having bwamem2 index tasks that ran forever and failed. Updated bwamem2 to use 28.B of memory per byte of fasta. Issue for reference: https://github.com/bwa-mem2/bwa-mem2/issues/9

Also tracked down the required memory for bwa index while I was at it. It doesn't seem to fail in practice, because most genomes' requirements come in under the default memory allocation.

This isn't the first place people have run into this: https://github.com/nf-core/modules/pull/6628
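
For reference, a minimal sketch of the idea (the exact expression in the module may differ; fasta is the module's reference input):

process BWAMEM2_INDEX {
    input:
    tuple val(meta), path(fasta)

    // Scale the request with the reference size: bwa-mem2 indexing needs
    // roughly 28 bytes of RAM per byte of FASTA (see the issue linked above).
    memory { 28.B * fasta.size() }

    // ... script block as in the module
}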

edmundmiller avatar Sep 11 '24 17:09 edmundmiller

I like it 👍 it’ll also play nice with the new resourceLimits directive

matthdsm avatar Sep 11 '24 18:09 matthdsm

I was talking to @drpatelh about this earlier this week. Sounds good. Very neat if it scales in such a linear way.

Should we add a baseline of additional memory?

ewels avatar Sep 11 '24 19:09 ewels
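
For illustration, a fixed baseline could be folded into the same expression (the 1 GB constant here is an assumption, not a measured value):

memory { 28.B * fasta.size() + 1.GB }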

Closes https://github.com/nf-core/sarek/issues/1377

edmundmiller avatar Sep 11 '24 21:09 edmundmiller

What can we do for bwa mem and bwamem2 mem?

maxulysse avatar Sep 12 '24 15:09 maxulysse

> What can we do for bwa mem and bwamem2 mem?

What do you mean?

edmundmiller avatar Sep 24 '24 13:09 edmundmiller

Is it right to have these settings hardcoded in the module? How does it interact with a pipeline-level config file doing

withName: 'BWAMEM2_INDEX' {
    memory { ... }
}

Which one takes precedence?

muffato avatar Oct 16 '24 09:10 muffato

AFAIK the pipeline config takes precedence over the config hardcoded in the module. If you're worried about requesting too many resources, the resourceLimits directive should take care of that nicely.

matthdsm avatar Oct 16 '24 09:10 matthdsm
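
For reference, a site-level cap with resourceLimits (available since Nextflow 24.04) might look like this sketch, with placeholder values:

process {
    // Clamp whatever a module requests to what the scheduler can actually grant.
    resourceLimits = [ cpus: 32, memory: 256.GB, time: 48.h ]
}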

I was more worried that 28 GB / Gbp is still too high. I use 24 GB / Gbp in my pipelines and wouldn't want nf-core to force me to waste RAM ;) Also, your memory definition doesn't consider task.attempt. Are you absolutely certain that 28 GB / Gbp will work for every genome? nf-core resource definitions usually factor in task.attempt.

I wasn't worried about check_max going missing, since nf-core is about to mandate a recent Nextflow that supports resourceLimits.

muffato avatar Oct 16 '24 10:10 muffato

FYI, I've just checked our LSF logs and there have been zero memory failures over the 1,698 BWAMEM2_INDEX processes that we ran in 2024 with 24 GB/Gbp. Memory efficiency is ~76% (median) and goes up to 95%, meaning that 23 GB/Gbp might still work for all genomes (it's right at the limit), but 22 GB/Gbp would certainly yield some memory errors.

Regardless of the scaling factor you use, I'd still keep task.attempt just in case (I'm overcautious!).

muffato avatar Oct 16 '24 10:10 muffato
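
A sketch of the linear scaling combined with task.attempt, using the 24 GB/Gbp figure quoted above (the retry settings are illustrative):

// Grow the request on each retry so a failed task is resubmitted with more RAM.
memory { 24.B * fasta.size() * task.attempt }
errorStrategy 'retry'
maxRetries 2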

> Regardless of the scaling factor you use, I'd still keep task.attempt just in case (I'm overcautious!).

I think that's a great point; we should definitely add that here.

Any opinions on the scaling factor? Should we really double it every time, or would 1.5x be generous enough?

> FYI, I've just checked our LSF logs and there have been zero memory failures over the 1,698 BWAMEM2_INDEX processes that we ran in 2024 with 24 GB/Gbp.

Power to you! 😆 I vote we go with the value the bwamem2 issue says it's expected to use, unless you have a better link to point at when people start asking why their jobs keep failing.

edmundmiller avatar Oct 16 '24 20:10 edmundmiller
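
A 1.5x growth per attempt could be written like this (a sketch; the base factor and growth curve are assumptions):

// 1x on the first attempt, then 1.5x, 2.25x, ... instead of doubling.
memory { 28.B * fasta.size() * (1.5 ** (task.attempt - 1)) }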

This has gone stale except for @muffato's comments, which I've addressed, so let's merge it and see if anyone has issues. We can always revert if it breaks something.

edmundmiller avatar Mar 11 '25 18:03 edmundmiller

I've noticed a small issue with this when running on a small genome with docker:

docker: Error response from daemon: Minimum memory limit allowed is 6MB.

But nothing that can't be fixed with a label selector.

maxulysse avatar Mar 18 '25 12:03 maxulysse

We could also just fix it in the module: make sure it's not zero or less than 6 MB.

edmundmiller avatar Mar 18 '25 14:03 edmundmiller

We could take the max of the minimal requirement and the memory we compute there.

maxulysse avatar Mar 18 '25 14:03 maxulysse
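
A sketch of that clamp, with the 6 MB floor mirroring the Docker error above (in practice you'd probably want more headroom):

// Never request less than the Docker daemon's minimum memory limit.
memory { [ 28.B * fasta.size(), 6.MB ].max() }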