
out-of-memory

Open JhinAir opened this issue 1 year ago • 3 comments

Hi, @agudys

When I run kmer-db, I always hit an out-of-memory error, even when testing with a very small input FASTA or the example file in your 'data' directory. My command is:

    kmer-db build -k 31 -t 2 test.list test.db

The file test.list includes only one sample, data/MN908947, but this command consumes more than 100 GB of memory and gets killed:

    /var/spool/slurm/d/job1132243/slurm_script: line 10: 1165089 Killed kmer-db build -k 31 -t 2 test.list test.db
    slurmstepd: error: Detected 9 oom-kill event(s) in StepId=1132243.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

The same error occurs when I use a k-mer file produced by KMC as input. How could this be? Could you please help me with this? Thank you! Best,

Jing

JhinAir, Jun 10 '24 06:06

This problem occurs when the k-mer size is set to 31, which is larger than the maximum in the latest version. Why is that?

JhinAir, Jun 10 '24 07:06

I'm having the same problem: kmer-db runs out of memory pretty quickly for 31-mers, even for a very small FASTA file (180 sequences of ~6 kb each).

A quick benchmark for this fasta file:

for kmer_len in 18 24 28 31; do
    kmer-db build -k ${kmer_len} -multisample-fasta -alphabet nt -t 4 seqs.fa my_db
done

K-mer length | Number of samples | Number of patterns (bytes) | Number of k-mers
18           | 180               | 176 (11,056 B)             | 15,059
24           | 180               | 201 (12,608 B)             | 15,827
28           | 180               | 202 (12,688 B)             | 16,332
31           | 180               | OOM killed                 | OOM killed

  • 18-mer: db generated in 0.17 s with 0.093 GB peak memory
  • 24-mer: db generated in 0.18 s with 0.105 GB peak memory
  • 28-mer: db generated in 14.04 s with 3.377 GB peak memory (high memory usage, but it still works)
  • 31-mer: killed due to out-of-memory even though >40 GB was available on the machine
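
For anyone who wants to reproduce this without taking down a shared node, a capped run might look like the sketch below. It is only an illustration: run_capped is a hypothetical helper (not part of kmer-db), it assumes bash on Linux, and it limits virtual address space with ulimit -v, so the process should fail on allocation instead of waiting for the kernel or Slurm OOM killer.

```shell
#!/usr/bin/env bash
# run_capped: hypothetical helper, not part of kmer-db.
# Usage: run_capped <limit_in_GiB> <command> [args...]
run_capped() {
    local limit_gib=$1
    shift
    (
        # ulimit -v takes KiB. The subshell keeps the limit from
        # leaking into the calling shell.
        ulimit -v $(( limit_gib * 1024 * 1024 )) || exit 125
        "$@"
    )
}

# Example run (assumes kmer-db is on PATH and seqs.fa exists):
# for kmer_len in 18 24 28 31; do
#     run_capped 8 kmer-db build -k ${kmer_len} -multisample-fasta -alphabet nt -t 4 seqs.fa my_db \
#         || echo "k=${kmer_len}: failed with exit status $?"
# done
```

The 8 GiB cap in the example is arbitrary. Where GNU time is installed, prefixing the command with /usr/bin/time -v also prints the maximum resident set size, which is one way to collect peak-memory figures like those above.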

ziggy-zaia, Jul 10 '25 22:07