OOM with extract calls for very large bam files
I'm running modkit extract calls (v0.5.0) on some very large BAMs on my HPC, and it fails with an OOM even when I allocate 50 GB of memory. If the reads are streamed to the extract calls processor, I don't understand why it would run out of memory. If this is expected behavior, what is the heuristic for setting the memory requirement?
Edit: my command is:
modkit extract calls --threads 10 --mapped-only --pass-only --reference $ref --cpg --log-filepath $outfile3 $outfile1 $outfile2
Hello @billytcl,
Sorry about that. I'm working on making this faster and less resource hungry. Could you try passing --ignore-index or --queue-size 1000? The former will almost certainly use less memory but will run slower; the latter could be quicker and use less memory. If you don't want to fiddle around with it, I would use --ignore-index.
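For reference, here is a rough sketch of the command from the original post with --ignore-index added. It reuses the same $ref and $outfile1..3 variables; the default values filled in below are placeholders for illustration only, not real paths. It prints the command by default and only runs it when RUN=1 is set, so you can check the arguments before submitting a job.

```shell
#!/bin/sh
# Placeholder defaults (hypothetical); export your real values before running.
: "${ref:=reference.fa}"
: "${outfile1:=outfile1}"
: "${outfile2:=outfile2}"
: "${outfile3:=outfile3}"

# Same options as the original command, plus --ignore-index.
# To try the other suggestion instead, swap --ignore-index for: --queue-size 1000
set -- modkit extract calls \
    --threads 10 --mapped-only --pass-only \
    --reference "$ref" --cpg \
    --ignore-index \
    --log-filepath "$outfile3" "$outfile1" "$outfile2"

# Show the assembled command line.
cmd_line="$*"
printf '%s\n' "$cmd_line"

# Only execute when explicitly requested, e.g. RUN=1 sh run_extract.sh
if [ "${RUN:-0}" = "1" ]; then
    "$@"
fi
```

The script name run_extract.sh in the comment is just an example; the only change relative to the original command is the extra flag.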
Just following up on this, I did --queue-size 1000 (AFAIK the default is 10k?) and it still OOMs out on me. How much slower is --ignore-index? I'm also using 10 threads. Would that be an issue too? Or is there a heuristic that I can use to estimate memory usage?
Edit: I lowered the queue size to 500 and the interval size to 5000, and it's still crashing on some files. Perhaps there could be a modkit feature that samples the input to guess at the right parameters without OOMing? E.g., you specify --max-mem=X and it figures out the rest?
Hello @billytcl,
I'd recommend using --ignore-index to unblock your workflow in the short term. The whole pileup and extract apparatus is going through a re-write at the moment and the next version will be quite a bit better.
Do you have an idea of the relative speed difference with versus without --ignore-index? I have to set an HPC job time limit.
Hello @billytcl,
Sorry for the delay. I did a quick test on some ~30X C+A all-context data and --ignore-index didn't change the run time at all. It's still not as fast as it should be, but --ignore-index uses less memory and runs at about the same speed.
That's great! Trying out --ignore-index now. Actually, it would be even better if it had a resume feature if it crashes due to OOM/HPC timeout. Doing extract calls on a ~40-50X genome takes quite a while...