memory allocation failed
Hi, I'm using GGCAT for building eulertigs. So far I have had no problems and have successfully built eulertigs for large collections.
But I noticed that on the whole 661k "Blackwell" collection, as well as on this collection of fungi https://zenodo.org/records/17093970 which contains 1624 genomes, the computation is aborted at its final stage with the same message:
Started phase: eulertigs building [step1]
memory allocation of 309237645312 bytes failed
Aborted (core dumped)
In both cases the algorithm tries to allocate more than 300 GB of RAM, which looks suspicious to me. Isn't the maximum RAM usage capped by the -m option?
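For scale, the byte count in the abort message above can be converted to binary units. A minimal Python sketch (the helper name to_gib is only illustrative) shows that the request is exactly 288 GiB, or about 309 GB in decimal units:

```python
GIB = 1024 ** 3  # bytes per gibibyte


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


failed_request = 309_237_645_312  # copied from the abort message above
print(f"{to_gib(failed_request):.1f} GiB")  # 288.0 GiB
print(f"{failed_request / 1e9:.1f} GB")     # 309.2 GB
```

A request that is an exact multiple of a GiB is more likely a single buffer sized up front than the sum of many small allocations, although that is only a guess from the number itself.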
This is the command I'm using, just for reference:
ggcat build -k 31 -j 64 -l ~/jgi_fungi_filenames.txt -s 1 --eulertigs -o jgi_fungi.k31.eulertigs.fa -t tmp_dir -m 64
That is, 64 parallel threads and 64 GB of RAM.
In any case, the test machine I'm using has ~500 GB of RAM, so I don't know why that message shows up and causes the crash.
Any help is very much appreciated. Thanks!
Best, -Giulio
I forgot to mention that I'm using ggcat_cmdline 2.0.0.
(I retried the previous command with fewer threads and less memory, but I got the same error.)
Oh, I now see that the process is taking far more memory than I expected: already 350G out of my 500G available.
Is this normal? I thought the -m parameter was there to avoid this problem, but I notice you wrote "This usage does not include the needed memory for the processing steps."...
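One way to see how far the real usage goes beyond the -m value is to record the peak resident set size of the whole run. This is a minimal sketch, assuming a Unix-like system where Python's resource module is available; the helper name run_and_report_peak is hypothetical, and ru_maxrss is reported in KiB on Linux but in bytes on macOS:

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd, wait for it, and return (exit code, peak RSS).

    The peak is the largest resident set size among the child processes
    this Python process has waited for so far.
    """
    completed = subprocess.run(cmd)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak


# Stand-in command so the sketch runs anywhere; replace it with the ggcat
# command line from the report to measure the real run.
code, peak = run_and_report_peak(
    [sys.executable, "-c", "x = bytearray(50_000_000)"]
)
print(f"exit code {code}, peak RSS {peak} (KiB on Linux)")
```

The same figure is available from /usr/bin/time -v on Linux as "Maximum resident set size".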
Hi Giulio, I was able to reproduce the bug in the latest version as well, so it's definitely something to fix. I suspect it's caused by a deserialization error while decoding a sequence length, which in turn causes the large allocation you pointed out. I will try to fix it as soon as I can.
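To illustrate the kind of failure suspected here, the sketch below decodes a length-prefixed record and rejects a length that cannot possibly fit in the remaining input. The layout (an 8-byte little-endian length followed by the payload) and the function name read_record are assumptions made for illustration only; this is not GGCAT's actual on-disk format or code.

```python
import struct
from typing import Tuple


def read_record(buf: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Read one record: 8-byte little-endian length, then the payload.

    Hypothetical layout for illustration; returns (payload, next offset).
    """
    header_end = offset + 8
    if header_end > len(buf):
        raise ValueError("truncated length header")
    (length,) = struct.unpack_from("<Q", buf, offset)
    remaining = len(buf) - header_end
    # A decoder that trusted `length` here would try to reserve that many bytes.
    if length > remaining:
        raise ValueError(
            f"implausible length {length}, only {remaining} bytes left"
        )
    return buf[header_end:header_end + length], header_end + length


good = struct.pack("<Q", 4) + b"ACGT"
print(read_record(good))  # (b'ACGT', 12)

# Starting one byte too late pulls a payload byte into the length field,
# which turns a 4-byte record into an enormous bogus length.
try:
    read_record(good, 1)
except ValueError as err:
    print("rejected:", err)
```

Checking the decoded length against the bytes actually available turns a misaligned read into a clear error instead of a multi-hundred-gigabyte allocation request.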
Thank you Andrea!
Hi Giulio, I fixed some bugs in the dev branch and tested the construction; it now works on my machine. Can you test it? Thanks, Andrea
Thank you Andrea.
How should I build the project now that I'm on the dev branch?
I tried cargo build and cargo install --path crates/cmdline/ --locked as described in the README but both fail.
Ok, I managed to build it :) I first needed to run rustup update.
Hi Andrea, on the entire Blackwell 661k, it failed again with the same error message:
started phase: maximal unitigs links building [step 3]
memory allocation of 154618822656 bytes failed
Aborted (core dumped)
Even though the request is large, I have 0.5 TB of RAM, so the allocation should be possible.