Bracken
Add option for `bracken-build` to invoke Kraken 2 with `--memory-mapping` flag
I tried to build a Bracken `nt` database, but I ran into this error:
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
This was caused by the `kraken2` process running out of memory. For context, the `nt` database I'm using (k=36) takes up roughly 220 GiB, and the system I'm running it on has about 250 GiB of RAM. To try to solve this, I patched `bracken-build` so that it passes a `--memory-mapping` flag to `kraken2`, which I think will work (if you're either patient enough or have access to a scratch volume with fast I/O). If you want, I can share that patch.
@fanninpm, could you please share this patch with me? I ran into a similar problem with my custom database.
@savytskanatalia Unfortunately, I no longer have an exact copy of the script that worked for me, but you can dig into `bracken-build` to find where that shell script invokes `kraken2` and add the `--memory-mapping` flag to it (probably line 139). Be aware that this will slow execution down drastically, as disk I/O is orders of magnitude slower than RAM.
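For anyone who wants to script that edit, here is a minimal sketch rather than the original patch. It assumes the invocation inside `bracken-build` contains the literal text `kraken2 --db`, which may not hold for every Bracken version, so review the lines that `grep` prints before trusting the result. The function name `add_memory_mapping` and the `.orig` backup suffix are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: insert --memory-mapping into the kraken2 call(s) of a bracken-build script.
# Assumption: the call contains the literal text "kraken2 --db".

add_memory_mapping() {
    local script="$1"

    # Keep a pristine copy once, so repeated runs never overwrite the original.
    [ -e "$script.orig" ] || cp "$script" "$script.orig"

    # Show every line mentioning kraken2 (with line numbers) for manual review.
    grep -n 'kraken2' "$script" || true

    # Add the flag only on lines that do not already carry it.
    sed -i.bak '/--memory-mapping/! s/kraken2 --db/kraken2 --memory-mapping --db/' "$script"
    rm -f "$script.bak"
}

# Example usage (resolves the script from the active environment):
# add_memory_mapping "$(command -v bracken-build)"
```

Running it twice is harmless, because lines that already contain `--memory-mapping` are skipped. To undo the change, copy the `.orig` file back over the script.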
@fanninpm Thank you! I will give it a try ><.
It's surprising that, with ~330 GB of memory across 2 nodes, I keep getting the `std::bad_alloc` error.
However, adding `--memory-mapping` at lines 139 and 143 of the `bracken-build` script installed using conda seems to be working. At the time of writing, no memory-related error is shown, and the run has gotten past that point.
Hope this helps others who use a conda installation and see this memory error.
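Since the line numbers mentioned in this thread (139 and 143) may differ between Bracken versions, it can help to print those lines before editing. A small sketch, where `show_lines` is a hypothetical helper name and the script path is taken from whatever environment is active:

```shell
#!/usr/bin/env bash
# Sketch: print selected lines of a script, prefixed with their line numbers.

show_lines() {
    local script="$1"
    shift
    local n
    for n in "$@"; do
        # Print "N: <text of line N>" for each requested line number.
        awk -v n="$n" 'NR == n { printf "%d: %s\n", NR, $0 }' "$script"
    done
}

# Example usage with the line numbers reported above:
# show_lines "$(command -v bracken-build)" 139 143
```

If the printed lines do not contain a `kraken2` call, the installed version probably has a different layout, and searching the script for `kraken2` is the safer way to find the right place.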