Bracken
kmer2read_distr killed at STEP 3 (READING DATABASE.KRAKEN FILE)
Hello. I am trying to build a Bracken DB for my custom Kraken2 DB, which is similar to maxikraken2_1903_140GB ( https://lomanlab.github.io/mockcommunity/mc_databases.html ). As I have not found any distributed Bracken DB for the maxikraken2 DB, I am regenerating it to obtain the files Bracken needs. The previous step produced a database.kraken of 729 GB. My machine has 377 GB of RAM, so whenever I run kmer2read_distr --seqid2taxid the job gets killed, I suspect because it runs out of RAM. Is there any way to split the database, or to avoid loading it fully into RAM, while building the DB? Example:
/opt/bracken/src/kmer2read_distr --seqid2taxid Krakenstein2/seqid2taxid.map --taxonomy Krakenstein2/taxonomy --kraken Krakenstein2/database.kraken --output Krakenstein2/Krakenstein_150mers.kraken -k 35 -l 150 -t 32
>>STEP 0: PARSING COMMAND LINE ARGUMENTS
Taxonomy nodes file: Krakenstein2/taxonomy/nodes.dmp
Seqid file: Krakenstein2/seqid2taxid.map
Num Threads: 32
Kmer Length: 35
Read Length: 150
>>STEP 1: READING SEQID2TAXID MAP
26618342 total sequences read
>>STEP 2: READING NODES.DMP FILE
2416809 total nodes read
>>STEP 3: READING DATABASE.KRAKEN FILE
13513290Killednces read...
And an example of the message from the OOM killer:
dmesg -T | egrep -i 'killed process'
[Mon May 9 19:29:08 2022] Out of memory: Killed process 2370765 (kmer2read_distr) total-vm:397024724kB, anon-rss:389071168kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:777036kB oom_score_adj:0
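For a rough sense of how much RAM the full run might need, the numbers in the log and the OOM message can be extrapolated. This is only a back-of-the-envelope sketch in Python: it assumes memory grows linearly with the number of sequences read in STEP 3, and that STEP 3 has to read the same 26618342 sequences counted in STEP 1. Neither assumption is confirmed by the Bracken documentation, and the function names are hypothetical.

```python
import re

def anon_rss_kb(oom_line: str) -> int:
    """Extract the anon-rss value (in kB) from a kernel OOM-killer message."""
    match = re.search(r"anon-rss:(\d+)kB", oom_line)
    if match is None:
        raise ValueError("no anon-rss field found")
    return int(match.group(1))

def extrapolate_kb(rss_kb: int, seqs_done: int, seqs_total: int) -> float:
    """Linear extrapolation (ASSUMPTION): memory scales with sequences read."""
    return rss_kb * seqs_total / seqs_done

# OOM line and sequence counts copied from this thread.
oom = ("Out of memory: Killed process 2370765 (kmer2read_distr) "
       "total-vm:397024724kB, anon-rss:389071168kB, file-rss:0kB")
rss = anon_rss_kb(oom)
estimate = extrapolate_kb(rss, seqs_done=13513290, seqs_total=26618342)

# The kernel's "kB" is treated as 1024 bytes here, so kB / 1024**2 gives GiB.
print(f"RSS at kill: {rss / 1024**2:.0f} GiB")
print(f"Estimated for all sequences: {estimate / 1024**2:.0f} GiB")
```

Under those assumptions the estimate lands well above 700 GiB, which would be consistent with the job being killed about halfway through on a 377 GB machine. Later steps may need more or less memory than STEP 3, so treat the figure as a lower-bound guess at best.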
Hello @savytskanatalia! I ran into a similar problem with my custom Kraken2 DB. Did you manage to find any workaround for this issue?
@Smedard my workaround was transferring the job to a machine with 1 TB of RAM and restarting it there... Though it is still running Step 4 of DB generation.
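Before restarting on another machine, it may be worth checking its installed RAM against whatever memory estimate you trust, so that a long run does not die in the same place. The following is a minimal Linux-only sketch that parses `/proc/meminfo`; the helper names and the 730 GiB threshold are illustrative assumptions, not values taken from Bracken.

```python
import os

def mem_total_kb(meminfo_text: str) -> int:
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            return int(line.split()[1])
    raise ValueError("MemTotal not found")

def has_enough_ram(meminfo_text: str, required_kb: int) -> bool:
    """True if the machine's total RAM is at least required_kb."""
    return mem_total_kb(meminfo_text) >= required_kb

# Hypothetical requirement: about 730 GiB, expressed in kB (1 GiB = 1024**2 kB).
REQUIRED_KB = 730 * 1024**2

if os.path.exists("/proc/meminfo"):  # only present on Linux
    with open("/proc/meminfo") as fh:
        text = fh.read()
    verdict = "enough" if has_enough_ram(text, REQUIRED_KB) else "not enough"
    print(f"MemTotal: {mem_total_kb(text) / 1024**2:.0f} GiB ({verdict})")
```

Total RAM is only an upper bound on what a single process can use, since other jobs and the page cache compete for it, so some headroom above the estimate is advisable.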