BifurcationKit: Bad performance relative to user supplied vector field in a function
Hi,
I was expecting MTK to be much faster than a user passed vector field and ForwardDiff jacobian. This is not the case. Note that in the latest version of BifurcationKit, you can pass the inplace vector field and the inplace jacobian.
Perhaps, it could be easier to define a new type like BifurcationProblemMTK which redefine residual, residual!, jacobian, jacobian! and the @optic functionality so that closures are avoided.
using BifurcationKit, ModelingToolkit
@variables t x(t) y(t)
@parameters μ
D = Differential(t)
eqs = [D(x) ~ μ * x - y - x * (x^2 + y^2),
D(y) ~ x + μ * y - y * (x^2 + y^2)]
@named osys = ODESystem(eqs, t)
osys = complete(osys)
bif_par = μ
plot_var = x
p_start = [μ => 1.0]
u0_guess = [x => 0.0, y => 0.0]
bprob = BifurcationProblem(osys,
u0_guess,
p_start,
bif_par;
plot_var = plot_var,
jac = false)
using BenchmarkTools
_out = zeros(2)
_J = zeros(2,2)
@btime BifurcationKit.residual($bprob, $(bprob.u0), $(bprob.params)) # 28.769 ns (1 allocation: 80 bytes)
@btime BifurcationKit.residual!($bprob, $_out, $(bprob.u0), $(bprob.params)) # 14.389 ns (0 allocations: 0 bytes)
@btime BifurcationKit.jacobian($bprob, $(bprob.u0), $(bprob.params)) # 323.101 ns (6 allocations: 432 bytes)
@btime BifurcationKit.jacobian!($bprob, $_J, $(bprob.u0), $(bprob.params)) # 308.194 ns (5 allocations: 336 bytes)
function Fsl!(out, X, p)
(;μ, ) = p
x, y = X
ua = x^2 + y^2
out[1] = μ * x - y - x * (x^2 + y^2)
out[2] = x + μ * y - y * (x^2 + y^2)
out
end
bprob2 = BifurcationProblem(Fsl!, zeros(2), (μ=0.1, b=0), (@optic _.μ))
@btime BifurcationKit.residual($bprob2, $(bprob2.u0), $(bprob2.params)) # 15.238 ns (1 allocation: 80 bytes)
@btime BifurcationKit.residual!($bprob2, $_out, $(bprob2.u0), $(bprob2.params)) # 1.916 ns (0 allocations: 0 bytes)
@btime BifurcationKit.jacobian($bprob2, $(bprob2.u0), $(bprob2.params)) # 289.651 ns (6 allocations: 432 bytes)
@btime BifurcationKit.jacobian!($bprob2, $_J, $(bprob2.u0), $(bprob2.params)) # 280.585 ns (5 allocations: 336 bytes)
It should just match in this case, not be any faster or slower. It would be good to figure out why it's not in this case.
The inplace jacobian should be much faster. If I code it myself, it is like 5ns
did you get a clue?
Sorry, forgot about this, will look this afternoon
bump! (friendly one)
On 9.76 it has improved, I get
julia> @btime BifurcationKit.residual($bprob, $(bprob.u0), $(bprob.params)) # 20.186 ns (1 allocation: 80 bytes)
julia> @btime BifurcationKit.residual!($bprob, $_out, $(bprob.u0), $(bprob.params)) # 5.6 ns (0 allocations: 0 bytes)
julia> @btime BifurcationKit.jacobian($bprob, $(bprob.u0), $(bprob.params)) # 270.429 ns (6 allocations: 432 bytes)
julia> @btime BifurcationKit.jacobian!($bprob, $_J, $(bprob.u0), $(bprob.params)) # 260.717 ns (5 allocations: 336 bytes)
The final difference is likely the difference between NaNMath.^ and literal_pow, where in the non-symbolic version it's specializing on the fact that ^ to an integer and thus changing it to x*x whereas the Symbolics generated code is using NaNMath.^(x,2) which I don't know if it matches the literal pow path? It might be making that a 2.0.
eqs = [D(x) ~ μ * x - y - x * (x*x+y*y),
D(y) ~ x + μ * y - y *(x*x+y*y)]
function Fsl!(out, X, p)
(;μ, ) = p
x, y = X
ua = x^2 + y^2
out[1] = μ * x - y - x * (x*x + y*y)
out[2] = x + μ * y - y * (x*x + y*y)
out
end
gives no change