Likely erroneous use of `Threads.nthreads` and `Threads.threadid`
Hello,
From the following PkgEval log: https://s3.amazonaws.com/julialang-reports/nanosoldier/pkgeval/by_hash/5da257d_vs_d63aded/RecurrenceAnalysis.primary.log it seems likely that this package is using Threads.nthreads and Threads.threadid incorrectly as described in https://julialang.org/blog/2023/07/PSA-dont-use-threadid/ and https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/tls/tls/#The-naive-(and-incorrect)-approach.
In the upcoming Julia 1.12, julia runs with an interactive thread by default which means that the number of worker threads nthreads() defaults to 1 but the thread id that things will run on in a threaded context will typically be 2 (the interactive thread having the id 1):
julia> Threads.nthreads()
1
julia> Threads.@threads for i = 1:3
@show Threads.threadid()
end
Threads.threadid() = 2
Threads.threadid() = 2
Threads.threadid() = 2
Trying to index a vector of length Threads.nthreads() with Threads.threadid() will thus error.
Note that even though this code only started directly erroring in 1.12 the code was likely already incorrect on earlier Julia versions unless it made sure that the spawned tasks did not migrate between threads (which can happen at any yield point).
Some remedies to this are:
- Use the new
OncePerThread(orOncePerTask) functionality added in 1.12. - Use task local values instead of thread local values as described in https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/tls/tls/#TLV
- Use a
Channelas described in https://juliafolds2.github.io/OhMyThreads.jl/stable/literate/tls/tls/#The-safe-way:-Channel. - Use
Threads.maxthreadid()instead ofThreads.nthreads(). This can waste some memory sincemaxthreadid()can give a higher number than the number of worker threads. It also doesn't help against the task migration issue described earlier (but the previous code did not do that either).
After looking into the blog post of OhMyThreads, it seems to me that the best approach is using Channels. That doesn't depend on backwards-incompatible features of 1.12, and task local values are not adequate for the type of tasks that are parallelized in this package (the values that are computed in the parallelized loop are needed after the loops end).
@asinghvi17 : since you designed the parallelized code in the first place, do you want to try a fix on this? Otherwise I might do an attempt, based on the blog's examples.
Thanks for the ping - can try a fix this week or weekend. Yeah I probably wrote that before the advice was out :D
I have to fix this same issue in 5 more repositories of JuliaDynamics... So if someone can work on a fix here I would appreciate it, I would save me a lot of work by copying your effort!
If @asinghvi17 cannot dedicate time to this, I could start an attempt this weekend.
I'm working on a brief patch now using a task based approach