DynamicPolynomials.jl
Faster evaluation
I think I found something to improve the evaluation performance!
I benchmarked exponentiation, e.g. x^3, and found that it was quite slow :O
julia> x = rand();
julia> @benchmark ^($x, $3)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 22.653 ns (0.00% GC)
median time: 25.014 ns (0.00% GC)
mean time: 27.691 ns (0.00% GC)
maximum time: 260.665 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 995
But from this issue I found that there is a special LLVM instruction, which is exposed as Base.FastMath.pow_fast:
julia> @benchmark Base.FastMath.pow_fast($x, $3)
BenchmarkTools.Trial:
memory estimate: 0 bytes
allocs estimate: 0
--------------
minimum time: 3.586 ns (0.00% GC)
median time: 4.128 ns (0.00% GC)
mean time: 4.423 ns (0.00% GC)
maximum time: 194.732 ns (0.00% GC)
--------------
samples: 10000
evals/sample: 1000
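To make the idea concrete, here is a minimal sketch of how the faster power could be used when evaluating a single monomial. The helper name `eval_monomial_fast` and its signature are hypothetical and not part of DynamicPolynomials.jl. `Base.FastMath.pow_fast` is an unexported internal function, so its behaviour may change between Julia versions, and its result can differ from `^` in the last bits.

```julia
# Hypothetical helper (not part of DynamicPolynomials.jl): evaluate the
# monomial x[1]^exps[1] * x[2]^exps[2] * ... with Base.FastMath.pow_fast.
function eval_monomial_fast(x::Vector{Float64}, exps::Vector{Int})
    acc = 1.0
    # eachindex(x, exps) throws if the two vectors have different lengths.
    for i in eachindex(x, exps)
        # pow_fast trades strict floating-point accuracy for speed.
        acc *= Base.FastMath.pow_fast(x[i], exps[i])
    end
    return acc
end

x = [0.5, 2.0, 3.0]
exps = [3, 2, 1]

fast = eval_monomial_fast(x, exps)
ref = prod(x .^ exps)  # reference value computed with the regular ^

# Compare with a tolerance rather than ==, since fast-math results
# are not guaranteed to be bit-identical.
println(isapprox(fast, ref; rtol = 1e-12))
```

Whether this actually pays off inside the package would need to be benchmarked on real polynomials; the sketch only shows where the call would go.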
On the other hand, Base.FastMath.pow_fast is quite slow for Complex128 (around 100 ns), so I checked the normal ^ and found that it is actually faster there than in the Float64 case 🤔. I opened an issue in the Julia repository; let's see how it goes from there.
Good catch! This looks intriguing indeed. I am curious to see the result of the discussion on the Julia repo.