bart icon indicating copy to clipboard operation
bart copied to clipboard

Thread oversubscription on multicore systems

Open headmeister opened this issue 3 years ago • 1 comments

Hello, I encountered this error on our machine (bart v0.8 compiled from source) when using the ecalib through the python wrapper. Our machine has 128 cores and it failed with stating:

BLAS : Program is Terminated. Because you tried to allocate too many memory regions. Segmentation Fault.

This problem is most likely related to this openBLAS issue : https://github.com/awslabs/autogluon/issues/1020 When I limited the number of threads for openBLAS and OMP via an environ. variable to 32, it fixed itself. This might be an issue for others too, when bart is executed on bigger machines. I know this might not be the final solution, but can at least help in running Bart.

Best Regards, Jiri

headmeister avatar Oct 17 '22 10:10 headmeister

Thanks for pointing this out!

uecker avatar Oct 22 '22 10:10 uecker