NiLang.jl
add `i_mean` and `i_sum`, tweak `i_mean_sum` performance
I don't know whether `@inbounds` and `@simd` are allowed in NiLang.
x = rand(64, 64)
@btime i_mean_sum(0.0, 0.0, $x)
# 228.673 ns (0 allocations: 0 bytes) # PR
# 3.557 μs (0 allocations: 0 bytes) # master
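For reference, here is a minimal sketch of the same reduction in plain, non-reversible Julia, where `@inbounds` and `@simd` are certainly allowed. `plain_mean_sum` is a hypothetical name used only for illustration; it is not part of NiLang or of this PR, and it says nothing about whether NiLang's compiler accepts these macros.

```julia
# Hypothetical non-reversible baseline; NOT part of NiLang or this PR.
function plain_mean_sum(x::AbstractArray{T}) where {T<:AbstractFloat}
    s = zero(T)
    # @inbounds skips bounds checks; @simd lets the compiler reorder the
    # floating-point additions so the loop can be vectorized.
    @inbounds @simd for i in eachindex(x)
        s += x[i]
    end
    return s / length(x), s   # (mean, sum)
end
```

Benchmarking it the same way (`@btime plain_mean_sum($x)`) would give a rough lower bound to compare the reversible versions against.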
Interestingly, `i_mean_sum` is more performant than `i_mean`. Is this because the additional uncompute process is not optimized?
julia> @btime i_mean(0.0, $x);
476.871 ns (0 allocations: 0 bytes)
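One possible reading of the roughly 2x gap, assuming `i_mean` accumulates into a temporary, copies the mean out, and then reverses the accumulation to clear the temporary: the loop would run twice. The sketch below mimics that compute, copy, uncompute pattern in plain Julia. `mean_with_uncompute` is hypothetical and is not the PR's implementation.

```julia
# Hypothetical illustration of compute-copy-uncompute; NOT NiLang code.
function mean_with_uncompute(x::AbstractArray{T}) where {T<:AbstractFloat}
    n = length(x)
    tmp = zero(T)
    for i in 1:n          # compute: first pass over x
        tmp += x[i]
    end
    out = tmp / n         # copy the result out of the temporary
    for i in n:-1:1       # uncompute: second pass, in reverse order
        tmp -= x[i]
    end
    # tmp is back to zero only up to floating-point rounding in general
    return out, tmp
end
```

If this is what happens, the second pass alone would account for about twice the runtime of `i_mean_sum`, which keeps the sum as an output and therefore has nothing to uncompute.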
Edit: b7cc221 makes `i_mean` as performant as `i_mean_sum`. (I don't quite understand what's happening there.)