Potential issue with shell-based thread management
I made a change in #27 that could be an issue for some setups, so I would like to document it here. Specifically, in this commit the following lines were commented out:
```python
if USING_OMP and cpu_device:
    torch.set_num_threads(num_threads)
```
The reason for this was that in PyTorch 1.8 these lines caused a severe performance regression: at the Python level, PyTorch did not seem to handle switching the number of available threads well. I removed the lines because the regression was too large.
The downside is that shell-based OMP thread management may be ignored within forks. OMP specifies the number of threads each process can spawn but does not enforce a global limit, so if your limit is 8 and you fork 8 times, each of those forks could create 8 new threads and lead to oversubscription.
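As a rough illustration (this is not torchkbnufft code, and the worker body and thread counts are made up for the example), the sketch below shows the pattern: each forked worker can use its own full set of intra-op threads, and one possible mitigation is to call `torch.set_num_threads` inside each worker yourself.

```python
# Hypothetical example: with OMP_NUM_THREADS=8, each of the 8 forked workers
# may still spawn up to 8 intra-op threads, i.e. ~64 threads total.
import torch
import torch.multiprocessing as mp


def worker(num_threads):
    # Possible mitigation: explicitly cap intra-op threads in each fork.
    if num_threads is not None:
        torch.set_num_threads(num_threads)
    # Any multithreaded CPU op here can use up to torch.get_num_threads() threads.
    x = torch.randn(512, 512)
    return (x @ x).sum().item()


if __name__ == "__main__":
    ctx = mp.get_context("fork")
    with ctx.Pool(8) as pool:
        # Passing 1 keeps the total thread count near the number of workers;
        # passing None reproduces the oversubscription scenario described above.
        results = pool.map(worker, [1] * 8)
```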
In general this is a niche issue that I hope doesn't affect most people. I haven't been able to figure out how to fix it at the Python level; the answer may be to move to C++ as mentioned in Issue #28, if anyone decides to tackle this.