ThreadsX.jl

Performance of map

Open bkamins opened this issue 3 years ago • 2 comments

@tkf what explains the following performance difference? This is a fresh Julia session on Windows 11 with 8 threads:

julia> Threads.nthreads()
8

julia> x = rand(10^8) .- 0.5;

julia> @time map(abs, x);
  0.251657 seconds (99.43 k allocations: 768.457 MiB, 18.93% gc time, 14.49% compilation time)

julia> @time map(abs, x);
  0.239117 seconds (3 allocations: 762.940 MiB, 22.72% gc time)

julia> using ThreadsX

julia> @time ThreadsX.map(abs, x);
  1.842571 seconds (2.53 M allocations: 4.024 GiB, 10.90% gc time, 33.76% compilation time)

julia> @time ThreadsX.map(abs, x);
  1.176356 seconds (1.51 k allocations: 3.888 GiB, 14.73% gc time)

If I use 1, 2, or 4 threads the situation is similar.

Thank you!

bkamins, Jun 12 '22 09:06

Hoping to mitigate this somewhat with https://github.com/JuliaFolds/Transducers.jl/pull/553. However, Transducers.jl (and by extension, ThreadsX.jl) is at its worst when dealing with very fast functions like abs.

The easiest fix for you to get better performance would be to use ThreadsX.map!(abs, similar(x), x).
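
A minimal sketch of that suggestion, assuming ThreadsX is installed and Julia was started with several threads (for example `julia -t 8`). The array here is smaller than the 10^8 elements in the report so the demo runs quickly, and the name `y` is only illustrative:

```julia
using ThreadsX

# Smaller input than in the original report, to keep the demo quick.
x = rand(10^6) .- 0.5

# Preallocate the destination with the same size and element type as x.
y = similar(x)

# Fill y in place instead of letting ThreadsX.map build a new array.
ThreadsX.map!(abs, y, x)

# Sanity check against the serial Base.map result.
@assert y == map(abs, x)
```

Calling `@time ThreadsX.map!(abs, y, x)` a second time, as in the session above, excludes compilation and shows whether the preallocated form helps on a given machine.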

MasonProtter, May 04 '23 06:05