NiLang.jl

add `i_mean` and `i_sum`, tweak `i_mean_sum` performance

Open johnnychen94 opened this issue 3 years ago • 1 comment

I don't know whether `@inbounds` and `@simd` are allowed in NiLang.

```julia
x = rand(64, 64)
@btime i_mean_sum(0.0, 0.0, $x)
# 228.673 ns (0 allocations: 0 bytes) # PR
# 3.557 μs (0 allocations: 0 bytes) # master
```

Interestingly, `i_mean_sum` is faster than `i_mean`. Is this because the additional uncompute step is not optimized?

```julia
julia> @btime i_mean(0.0, $x);
  476.871 ns (0 allocations: 0 bytes)
```
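One possible reading of the gap, sketched in plain Julia rather than NiLang: if `i_mean` keeps the sum in a temporary and then uncomputes it, it walks the array twice, which would roughly match the 2x difference in the timings above. The function names `mean_sum_sketch` and `mean_sketch` are hypothetical and only illustrate the idea; they are not the code in this PR, and whether NiLang actually emits a second pass here is an assumption.

```julia
# Hypothetical plain-Julia sketch; not NiLang code and not the PR's implementation.

# One pass: accumulate the sum into `out_sum`, then add the mean to `out_mean`.
# Both results are returned, so nothing has to be uncomputed.
function mean_sum_sketch(out_mean, out_sum, x)
    @inbounds @simd for i in eachindex(x)
        out_sum += x[i]
    end
    out_mean += out_sum / length(x)
    return out_mean, out_sum
end

# Two passes: the sum lives in a temporary `s` that is cleared again
# by running the accumulation backwards (the assumed "uncompute" step).
function mean_sketch(out_mean, x)
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i]          # compute
    end
    out_mean += s / length(x)
    @inbounds @simd for i in eachindex(x)
        s -= x[i]          # uncompute; in floating point s may not return to exactly 0
    end
    return out_mean, s
end
```

Timing both sketches with `@btime` on the same `rand(64, 64)` input would show whether a second pass alone accounts for a slowdown of this size.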

Edit: b7cc221 makes `i_mean` as performant as `i_mean_sum`. (I don't quite understand what's happening there.)

johnnychen94, Apr 29 '21 16:04