Ferrite.jl
Improve performance of the threading example
With four threads this improves performance by ~20%, but with 16 threads it improves performance by 3x.
This is likely because I am now allocating the scratch space separately on each thread, which removes false sharing.
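
As a rough illustration of the idea, not the code from the example itself, the sketch below allocates the scratch buffers inside each spawned task instead of allocating them all up front on the main thread. The names `Scratch`, `element_work!`, and `assemble` are hypothetical and are not part of Ferrite.jl; the element routine is a stand-in that only fills the buffers and sums them.

```julia
using Base.Threads

# Hypothetical scratch buffers for one task; not a Ferrite.jl type.
struct Scratch
    ke::Matrix{Float64}   # local element matrix
    fe::Vector{Float64}   # local element vector
end

Scratch(n::Int) = Scratch(zeros(n, n), zeros(n))

# Stand-in for an element routine: overwrites the scratch buffers
# and returns a scalar contribution for this cell.
function element_work!(s::Scratch, cell::Int)
    fill!(s.ke, cell)
    fill!(s.fe, 1.0)
    return sum(s.ke) + sum(s.fe)
end

function assemble(ncells::Int, n::Int; ntasks::Int = nthreads())
    # Split the cells into contiguous chunks, one chunk per task.
    chunksize = max(1, cld(ncells, ntasks))
    chunks = collect(Iterators.partition(1:ncells, chunksize))
    partial = zeros(length(chunks))
    @sync for (i, chunk) in enumerate(chunks)
        Threads.@spawn begin
            # Allocated inside the task that uses it, so each task gets its
            # own separately allocated buffers instead of sharing a block
            # that was allocated in one go on the main thread.
            scratch = Scratch(n)
            acc = 0.0
            for cell in chunk
                acc += element_work!(scratch, cell)
            end
            partial[i] = acc   # a single write per task
        end
    end
    return sum(partial)
end
```

Calling `assemble(10, 2)` runs the stand-in work over ten cells with 2x2 scratch buffers. The hot loop only touches task-local memory, and the shared `partial` array is written once per task, so any remaining cache-line contention there should be negligible.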