Segmentation fault during building
Hey!
Trying to build the index using 10 threads on a server with 128 GB of RAM, but I keep getting a segmentation fault after a few hours. Any ideas?
Cheers, Rory
It's possible that you ran out of memory building the database. What genomes did you provide, and could I see the command you executed?
Edit: I’ve also added checks during insertion for failure cases (e.g., running out of memory). The HyperLogLog option (-H, I think) spends some CPU time estimating the final cardinality so the hash table can be sized up front, which can reduce the peak memory footprint by about a third. If the estimated table would be too large, it reports the estimated cardinality and fails immediately when trying to allocate, rather than crashing hours into the build.
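For anyone curious how the cardinality estimate works, here is a toy HyperLogLog sketch in Python. This is purely illustrative and is not bonsai's implementation (bonsai is C++); the class name, register count `p`, and the blake2b hash are all my own choices for the example. The idea is the same, though: estimate the number of distinct items (k-mers) cheaply, then use that estimate to pre-size the table or bail out before insertion starts.

```python
# Toy HyperLogLog: estimate distinct-item count without storing the items.
# NOT bonsai's code; a minimal sketch of the idea behind the -H option.
import hashlib
import math

class HyperLogLog:
    def __init__(self, p=14):
        # 2**p registers; relative error is roughly 1.04 / sqrt(2**p) (~0.8% here)
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash of the item (a k-mer, say)
        h = int.from_bytes(
            hashlib.blake2b(item.encode(), digest_size=8).digest(), "big"
        )
        idx = h >> (64 - self.p)               # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining bits give the "rank"
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)  # bias correction for m >= 128
        z = sum(2.0 ** -r for r in self.registers)
        e = alpha * self.m * self.m / z
        if e <= 2.5 * self.m:                  # small-range correction
            zeros = self.registers.count(0)
            if zeros:
                e = self.m * math.log(self.m / zeros)
        return e
```

With ~16 KB of registers this estimates hundreds of millions of distinct k-mers to within about 1%, which is why it's so much cheaper than building the table and finding out the hard way.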
Hey, thanks for the quick reply! I provided the default selection from the Python script, i.e. bacteria, archaea, viruses, and human. I ran the command from the README:
bonsai build -e -w50 -k31 -p20 -T ref/nodes.dmp -M ref/nameidmap.txt bns.db `find ref/ -name '*.fna.gz'`
I'll try building again with the HyperLogLog flag!