Jérémie du Boisberranger
It's not an oversubscription issue here. They are not nested. I first call gemm and then call a function which executes a parallel loop (with no blas inside).
Also, I ran `ps -o nlwp` on the process running my Python script and it returns 8, which is what I'd expect: 4 from OpenBLAS and 4 from OpenMP.
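For reference, the same thread count can be read from inside the process. This is a minimal sketch that assumes a Linux `/proc` filesystem; the helper name `count_native_threads` is hypothetical and only serves the illustration.

```python
import os


def count_native_threads(pid="self"):
    """Return the number of OS-level threads (LWPs) of a Linux process."""
    # Each entry in /proc/<pid>/task is one native thread, which is the
    # quantity that `ps -o nlwp` reports for the process.
    return len(os.listdir(f"/proc/{pid}/task"))


if __name__ == "__main__":
    # Prints the thread count of the current process, including any
    # OpenBLAS or OpenMP worker threads that have already been spawned.
    print(count_native_threads())
```

Calling it after the BLAS call and the parallel loop have both run should give a number comparable to the `ps` output, since both thread pools stay alive once created.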
Sorry, I'm not sure I understand your answer. I agree that HT is useless in HPC most of the time but it does not seem to be the only issue...
> can you also try with OMP_PROC_BIND=TRUE?

@isuruf it reduces the time from 21s to 2.6s. It's better but still much slower than expected.
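A sketch of one way to run such an experiment reproducibly: `OMP_PROC_BIND` is read when the OpenMP runtime initializes, so it is safest to set it in the environment of a fresh process. The helper name `run_with_proc_bind` is hypothetical, and the script being launched is whatever benchmark is under test.

```python
import os
import subprocess
import sys


def run_with_proc_bind(python_args, bind="TRUE"):
    """Run a Python child process with OMP_PROC_BIND set in its environment."""
    # Copy the current environment and override only OMP_PROC_BIND, so the
    # OpenMP runtime in the child sees the value before it starts any thread.
    env = dict(os.environ, OMP_PROC_BIND=bind)
    return subprocess.run(
        [sys.executable, *python_args],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )


if __name__ == "__main__":
    result = run_with_proc_bind(
        ["-c", "import os; print(os.environ['OMP_PROC_BIND'])"]
    )
    print(result.stdout.strip())
```

Running the same benchmark once with `bind="TRUE"` and once with `bind="FALSE"` would make the comparison explicit.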
here's the output of `cat /proc/self/status | grep _allowed` ``` Cpus_allowed: ff,ffffffff Cpus_allowed_list: 0-39 Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001 Mems_allowed_list: 0 ``` Sorry but I've no idea how to interpret this :/
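For readers in the same position: the `*_allowed` fields are hexadecimal bitmasks written as comma-separated 32-bit groups, where bit `i` set means CPU (or memory node) `i` is allowed. The sketch below decodes such a mask; the function name `ids_from_mask` is hypothetical.

```python
from typing import List


def ids_from_mask(mask: str) -> List[int]:
    """Decode a /proc/<pid>/status hex mask into the list of set bit indices."""
    # The commas only separate 32-bit groups, so dropping them yields one
    # large hexadecimal number whose least significant bit is index 0.
    value = int(mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


if __name__ == "__main__":
    cpus = ids_from_mask("ff,ffffffff")
    print(cpus[0], cpus[-1], len(cpus))
```

Applied to `ff,ffffffff`, this gives CPUs 0 through 39, which agrees with `Cpus_allowed_list: 0-39`: the process may run on every logical core. A `Mems_allowed` mask ending in `00000001` decodes to node 0 only, matching `Mems_allowed_list: 0`.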
I introspected the affinities for both OpenMP and OpenBLAS threadpools and it turns out that no affinity constraint is set (openblas is built with NO_AFFINITY). Here's the output of `sched_getaffinity`...
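One possible way to do this kind of introspection from Python is sketched below. It assumes Linux, where `os.sched_getaffinity` accepts a thread ID and `/proc/self/task` lists the native threads of the process; the helper name `thread_affinities` is hypothetical, and this is not necessarily how the output above was obtained.

```python
import os


def thread_affinities():
    """Map each native thread ID of this process to its set of allowed CPUs."""
    affinities = {}
    # Bail out on platforms without the Linux-style interfaces used here.
    if not hasattr(os, "sched_getaffinity") or not os.path.isdir("/proc/self/task"):
        return affinities
    for tid in os.listdir("/proc/self/task"):
        try:
            affinities[int(tid)] = os.sched_getaffinity(int(tid))
        except ProcessLookupError:
            # The thread exited between listing and querying; skip it.
            pass
    return affinities


if __name__ == "__main__":
    for tid, cpus in sorted(thread_affinities().items()):
        print(tid, sorted(cpus))
```

If every thread reports the full set of CPUs, no affinity constraint is in effect, which is consistent with an OpenBLAS built with `NO_AFFINITY`.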
@isuruf I think this issue is a good reason to always try to use an openblas built with openmp for the scikit-learn builds on conda-forge (I noticed it was not...
Hi, has the status on this evolved? I find it unexpected as well that setting the number of threads for openblas changes the internal state of openmp. I have...
Also I tested:

```py
openblas_set_num_threads(1);
omp_set_num_threads(original_omp_num_threads);
# calling openblas runs as if I didn't change any num_threads
# (at least it seems after doing some benchmarks)
# even if openblas_get_num_threads();...
```
I'm not sure it would fit nicely in `threadpoolctl.thread_limits`, since this is specific to OpenMP. But maybe through a different, more general-purpose context manager. I don't think that the default...