Inquiry on error

Open LanSabb opened this issue 9 months ago • 1 comments

Hi

I am attempting to run SemiBin2 with the following command:

SemiBin2 single_easy_bin -t 80 -i Contigs.fa -b Contigs.bam -o Result -environment global

However, the following error is repeated. Can you recommend the best solution for this issue?

Thanks

Error

2025-04-10 20:14:33 sandia SemiBin2[72604] INFO Running SemiBin2 version 2.2.0
2025-04-10 20:14:33 sandia SemiBin2[72604] INFO Binning for short_read
2025-04-10 20:14:34 sandia SemiBin2[72604] WARNING Did not detect GPU or CUDA was not installed/supported, using CPU.
2025-04-10 20:14:49 sandia SemiBin2[72604] INFO Generating training data...
2025-04-10 20:48:30 sandia SemiBin2[72604] INFO Calculating coverage for every sample.
2025-04-10 21:04:36 sandia SemiBin2[72604] INFO Processed: /data/MJ/Sejong/Platform/Novaseq_Coassembly2/0_Prep/4_Mapping/4_sort/Contigs.bam
2025-04-10 21:07:06 sandia SemiBin2[72604] INFO Start binning.
Traceback (most recent call last):
  File "/opt/anaconda3/envs/semibin/bin/SemiBin2", line 10, in <module>
    sys.exit(main2())
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1625, in main2
    single_easy_binning(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1317, in single_easy_binning
    binning_short(**binning_kwargs)
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/main.py", line 1222, in binning_short
    cluster(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 289, in cluster
    embedding, contig_labels = run_embed_infomap(logger, model, data,
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 155, in run_embed_infomap
    kl = cal_kl(depth[:,2*k], depth[:, 2*k + 1])
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/SemiBin/cluster.py", line 54, in cal_kl
    res = ne.evaluate(
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/numexpr/necompiler.py", line 973, in evaluate
    return re_evaluate(local_dict=local_dict, global_dict=global_dict, _frame_depth=_frame_depth)
  File "/opt/anaconda3/envs/semibin/lib/python3.9/site-packages/numexpr/necompiler.py", line 1005, in re_evaluate
    return compiled_ex(*args, **kwargs)
numpy._core._exceptions._ArrayMemoryError: Unable to allocate 1.26 TiB for an array with shape (588720, 588720) and data type float32

LanSabb, Apr 10 '25 12:04

This seems to be another instance of SemiBin not scaling well in terms of memory usage for very large binning jobs; see https://github.com/BigDataBiology/SemiBin/issues/171

luispedro, Apr 11 '25 00:04
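
For readers wondering where the 1.26 TiB figure in the traceback comes from, the arithmetic can be reproduced with a short sketch. It assumes only what the error message states: a dense square array with one row and one column per contig (588720 of them) stored as float32, i.e. 4 bytes per element. The helper name `pairwise_matrix_bytes` is hypothetical and is not part of SemiBin.

```python
# Hypothetical helper (not part of SemiBin): size of a dense n x n array.
def pairwise_matrix_bytes(n_contigs: int, itemsize: int = 4) -> int:
    """Bytes needed for a dense n_contigs x n_contigs array.

    itemsize defaults to 4 because float32 uses 4 bytes per element.
    """
    return n_contigs * n_contigs * itemsize


def to_tib(n_bytes: int) -> float:
    """Convert bytes to tebibytes (1 TiB = 2**40 bytes)."""
    return n_bytes / 2**40


# Shape reported in the error message: (588720, 588720), float32.
n = 588_720
full = pairwise_matrix_bytes(n)
print(f"{n} contigs: {to_tib(full):.2f} TiB")

# Memory grows with the square of the contig count, so half as many
# contigs would need a quarter of the memory for the same kind of array.
half = pairwise_matrix_bytes(n // 2)
print(f"{n // 2} contigs: {to_tib(half):.2f} TiB")
```

Because the requirement is quadratic in the number of contigs, reducing the number of contigs that reach this step lowers the allocation much faster than linearly. Whether and how that is practical for a given dataset is not covered in this thread.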