MKL.jl

Setting MKL thread number

Open hakkelt opened this issue 1 year ago • 2 comments

I'm not sure I understood the docs correctly, but I assumed that BLAS.set_num_threads(n) is equivalent to mkl_set_num_threads(n) and BLAS.get_num_threads() to mkl_get_max_threads(). That does not seem to be the case, however. If I call mkl_set_num_threads first, BLAS.get_num_threads reports the new value, but a later call to BLAS.set_num_threads does not affect the result of mkl_get_max_threads. Conversely, if I call BLAS.set_num_threads first, subsequent calls to mkl_set_num_threads do not change the value returned by BLAS.get_num_threads. I also tried MKL_DYNAMIC=TRUE, but it made no difference. Can someone clarify what's going on?

$ MKL_DYNAMIC=FALSE julia --project

julia> using MKL, MKL.MKL_jll

julia> using LinearAlgebra

julia> get_max_threads() = ccall((:mkl_get_max_threads, libmkl_rt), Int32, ());

julia> set_max_threads(n) = ccall((:mkl_set_num_threads, libmkl_rt), Cvoid, (Ptr{Int32},), Ref(Int32(n)));

julia> mkl_get_dynamic() = ccall((:mkl_get_dynamic, libmkl_rt), Int32, ());

julia> mkl_get_dynamic()
0

julia> BLAS.get_num_threads()
96

julia> get_max_threads()
96

julia> set_max_threads(55)

julia> get_max_threads()
55

julia> BLAS.get_num_threads()
55

julia> BLAS.set_num_threads(60)

julia> BLAS.get_num_threads()
60

julia> get_max_threads()
55

$ MKL_DYNAMIC=FALSE julia --project

julia> using MKL, MKL.MKL_jll

julia> using LinearAlgebra

julia> get_max_threads() = ccall((:mkl_get_max_threads, libmkl_rt), Int32, ());

julia> set_max_threads(n) = ccall((:mkl_set_num_threads, libmkl_rt), Cvoid, (Ptr{Int32},), Ref(Int32(n)));

julia> mkl_get_dynamic() = ccall((:mkl_get_dynamic, libmkl_rt), Int32, ());

julia> mkl_get_dynamic()
0

julia> BLAS.get_num_threads()
96

julia> BLAS.set_num_threads(55)

julia> BLAS.get_num_threads()
55

julia> get_max_threads()
96

julia> set_max_threads(60)

julia> get_max_threads()
60

julia> BLAS.get_num_threads()
55

hakkelt avatar Sep 20 '24 21:09 hakkelt

Hi @hakkelt, in terms of oneMKL logic, the equivalences BLAS.set_num_threads(n) == mkl_set_num_threads(n) and BLAS.get_num_threads() == mkl_get_max_threads() do not always hold. As far as I know, BLAS.set_num_threads uses mkl_domain_set_num_threads and BLAS.get_num_threads() uses mkl_domain_get_max_threads. If the domain-specific function mkl_domain_set_num_threads has not been called, your assumption is correct: BLAS uses the number of threads set by mkl_set_num_threads. Once mkl_domain_set_num_threads has been called, however, BLAS uses the domain-specific thread count instead of the one set by the more general mkl_set_num_threads. I hope this clarifies the oneMKL behavior.
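To make the precedence rule concrete, here is a toy model in plain Julia. It is only a sketch of the behavior described in this comment, not oneMKL code, and it assumes (as suggested above, not confirmed) that BLAS.set_num_threads maps to the domain-specific setter. The names ThreadSettings, set_global!, set_domain! and effective_threads are hypothetical and exist only for illustration.

```julia
# Toy model of the precedence rule; this does NOT call into oneMKL.
# All names below are hypothetical and used for illustration only.

mutable struct ThreadSettings
    global_threads::Int                 # stands in for mkl_set_num_threads
    domain_threads::Dict{Symbol,Int}    # stands in for mkl_domain_set_num_threads
end

# Start with a global value and no domain-specific overrides.
ThreadSettings(n::Integer) = ThreadSettings(n, Dict{Symbol,Int}())

set_global!(s::ThreadSettings, n::Integer) = (s.global_threads = n; s)
set_domain!(s::ThreadSettings, domain::Symbol, n::Integer) =
    (s.domain_threads[domain] = n; s)

# A domain-specific value, once set, takes precedence over the global value.
effective_threads(s::ThreadSettings, domain::Symbol) =
    get(s.domain_threads, domain, s.global_threads)

s = ThreadSettings(96)
set_global!(s, 55)
@show effective_threads(s, :blas)    # no domain value yet, so the global value applies
set_domain!(s, :blas, 60)
@show effective_threads(s, :blas)    # the domain value now applies
@show effective_threads(s, :lapack)  # other domains still follow the global value
set_global!(s, 40)
@show effective_threads(s, :blas)    # unchanged: the domain value keeps precedence
```

Under this model, the order of calls in the two REPL sessions above produces the same pattern: a global change is visible to BLAS only until a BLAS-specific value has been set.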

mkrainiuk avatar Oct 06 '24 06:10 mkrainiuk

Hi @mkrainiuk, thanks for the clarification; it helped a lot!

However, the problem I faced in my project remains: BLAS.set_num_threads sets both the BLAS and LAPACK thread counts when using OpenBLAS, but only the BLAS thread count when using MKL. Moreover, MKL currently offers no way to set the thread count for LAPACK alone; it can only be set for BLAS alone or for all MKL domains together. On the Intel forum, they said they will consider adding that domain in a future release... :/

In my current project, I managed to solve the problem by calling the C API directly and setting the thread count for all domains:

mkl_get_num_threads() = ccall((:mkl_get_max_threads, libmkl_rt), Int32, ())
mkl_set_num_threads(n) = ccall((:mkl_set_num_threads, libmkl_rt), Cvoid, (Ptr{Int32},), Ref(Int32(n)))

Wouldn't it make sense to expose these functions in MKL.jl and add some explanation to the README? Or is there a more straightforward solution?
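For changing the thread count only around a specific computation, one possible pattern is a small wrapper that sets the count, runs the work, and restores the previous value. The helper below is a hypothetical sketch (with_num_threads is not part of MKL.jl or LinearAlgebra). By default it uses the LinearAlgebra.BLAS getter and setter; under MKL, the two ccall wrappers from this comment could presumably be passed in as getter and setter instead.

```julia
using LinearAlgebra

# Hypothetical helper: run `f` with a temporary thread count, then restore
# the previous value, even if `f` throws. `getter` and `setter` default to
# the LinearAlgebra.BLAS functions but can be swapped for other wrappers.
function with_num_threads(f, n::Integer;
                          getter = BLAS.get_num_threads,
                          setter = BLAS.set_num_threads)
    old = getter()      # remember the current setting
    setter(n)
    try
        return f()
    finally
        setter(old)     # always restore the previous setting
    end
end

# Usage with the default BLAS getter/setter:
A = rand(200, 200)
B = with_num_threads(1) do
    A * A
end
```

With the wrappers defined above, the call would look like with_num_threads(1; getter = mkl_get_num_threads, setter = mkl_set_num_threads) do ... end. This has not been verified against MKL here.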

hakkelt avatar Dec 16 '24 19:12 hakkelt

I'm not sure whether a new issue should be created, but I also wanted to revive this one to ask for a way to change the thread count at runtime. Since BLAS.set_num_threads doesn't actually make MKL single-threaded, this is incorrect.

VinceNeede avatar Apr 27 '25 21:04 VinceNeede

For reference, here is the thread that @hakkelt himself started on the Intel oneMKL community forum:

https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-set-number-of-LAPACK-threads-during-runtime/m-p/1638274#M36554

VinceNeede avatar Apr 28 '25 18:04 VinceNeede

Fixed in #180

ViralBShah avatar Jun 04 '25 13:06 ViralBShah