Hi
I am attempting to run SemiBin2 with the following command: SemiBin2 single_easy_bin -t 80 -i Contigs.fa -b Contigs.bam -o Result --environment global
However, the run fails every time with the error below. Can you recommend a solution for this issue?
Thanks
Error
2025-04-10 20:14:33 sandia SemiBin2[72604] INFO Running SemiBin2 version 2.2.0
2025-04-10 20:14:33 sandia SemiBin2[72604] INFO Binning for short_read
2025-04-10 20:14:34 sandia SemiBin2[72604] WARNING Did not detect GPU or CUDA was not installed/supported, using CPU.
2025-04-10 20:14:49 sandia SemiBin2[72604] INFO Generating training data...
2025-04-10 20:48:30 sandia SemiBin2[72604] INFO Calculating coverage for every sample.
2025-04-10 21:04:36 sandia SemiBin2[72604] INFO Processed: /data/MJ/Sejong/Platform/Novaseq_Coassembly2/0_Prep/4_Mapping/4_sort/Contigs.bam
2025-04-10 21:07:06 sandia SemiBin2[72604] INFO Start binning.
Traceback (most recent call last):
  File "/opt/anaconda3/envs/semibin/bin/SemiBin2", line 10, in <module>
    sys.exit(main2())
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1625, in main2
    single_easy_binning(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1317, in single_easy_binning
    binning_short(**binning_kwargs)
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1222, in binning_short
    cluster(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 289, in cluster
    embedding, contig_labels = run_embed_infomap(logger, model, data,
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 155, in run_embed_infomap
    kl = cal_kl(depth[:,2*k], depth[:, 2*k + 1])
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 54, in cal_kl
    res = ne.evaluate(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/numexpr/necompiler.py", line 973, in evaluate
    return re_evaluate(local_dict=local_dict, global_dict=global_dict, _frame_depth=_frame_depth)
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/numexpr/necompiler.py", line 1005, in re_evaluate
    return compiled_ex(*args, **kwargs)
numpy._core._exceptions._ArrayMemoryError: Unable to allocate 1.26 TiB for an array with shape (588720, 588720) and data type float32
This seems to be another instance of SemiBin not scaling well in terms of memory usage for very large binning jobs: https://github.com/BigDataBiology/SemiBin/issues/171
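For what it's worth, the size in the error message follows directly from the array shape: a dense float32 matrix with 588720 rows and 588720 columns (presumably one per contig being binned, though that is my reading of the traceback rather than something I checked in the SemiBin source) needs 4 bytes per entry. A quick back-of-the-envelope check in Python; the helper name `dense_matrix_bytes` is just made up for illustration and is not part of SemiBin or numexpr:

```python
FLOAT32_BYTES = 4  # size of one float32 value in bytes


def dense_matrix_bytes(n_rows: int, n_cols: int, itemsize: int = FLOAT32_BYTES) -> int:
    """Memory needed for a dense n_rows x n_cols array, in bytes."""
    return n_rows * n_cols * itemsize


# Shape taken from the _ArrayMemoryError message above.
n = 588_720
size_tib = dense_matrix_bytes(n, n) / 2**40  # 1 TiB = 2**40 bytes

print(f"{size_tib:.2f} TiB")  # prints "1.26 TiB"
```

Since the requirement grows with the square of the row count, halving the number of rows would cut this allocation to roughly a quarter.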