OrdinaryDiffEq.jl
Default solver selection very slow (~20x) compared to `Rodas5P` on a given problem
The system is a bit unusual (it has a large number of similar variables). I am also not sure how interesting cases like this are, or how feasible they are to deal with. However, since the performance difference is large, I figured it could be worth knowing about.
# Fetch packages.
using Catalyst, OrdinaryDiffEqDefault, OrdinaryDiffEqRosenbrock, FiniteStateProjection, BenchmarkTools
# Create the model.
rs = @reaction_network begin
(p,d), 0 <--> X
(kB,kD), 2X <--> X2
end
# Create the ODEProblem.
ps = [:p => 1.0, :d => 0.2, :kB => 2.0, :kD => 5.0]
u0 = zeros(25,25)
u0[1,1] = 1.0
fsp_sys = FSPSystem(rs, [:X, :X2])
oprob = ODEProblem(fsp_sys, u0, 50.0, ps)
# Simulate using default solver and Rodas5P.
@btime sol1 = solve($oprob, Rodas5P()) # 126.148 ms (3719 allocations: 6.96 MiB)
@btime sol2 = solve($oprob) # 2.689 s (309050 allocations: 138.95 MiB)
Here, the state is a 25x25 matrix (so 625 variables in total).
The number of allocations looks like a regression somewhere, probably some kind of inference issue.
What's happening here is that this is a big enough problem (length 625) that we're using FBDF with a Krylov linear solver. Also, since the mass matrix isn't singular, we're starting the integration with Tsit5, which seems to make the time-stepping worse. In short, we seem to be making basically every wrong choice possible for this problem (other than using FBDF; plain FBDF is doing great here).
julia> @btime solve(oprob, AutoTsit5(FBDF(linsolve=LinearSolve.KrylovJL(), autodiff=AutoFiniteDiff())));
2.365 s (304006 allocations: 145.28 MiB)
julia> @btime solve(oprob, FBDF(linsolve=LinearSolve.KrylovJL(), autodiff=AutoFiniteDiff()));
411.827 ms (49606 allocations: 19.18 MiB)
julia> @btime solve(oprob, FBDF(linsolve=LinearSolve.KrylovJL()));
244.829 ms (35446 allocations: 18.30 MiB)
julia> @btime solve(oprob, FBDF());
56.068 ms (1379 allocations: 7.25 MiB)
Here are the associated stats:
julia> solve(oprob, AutoTsit5(FBDF(linsolve=LinearSolve.KrylovJL(), autodiff=AutoFiniteDiff()))).stats
SciMLBase.DEStats
Number of function 1 evaluations: 78552
Number of function 2 evaluations: 0
Number of W matrix evaluations: 155
Number of linear solves: 336
Number of Jacobians created: 0
Number of nonlinear solver iterations: 336
Number of nonlinear solver convergence failures: 0
Number of fixed-point solver iterations: 0
Number of fixed-point solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 129
Number of rejected steps: 48
julia> solve(oprob, FBDF(linsolve=LinearSolve.KrylovJL(), autodiff=AutoFiniteDiff())).stats
SciMLBase.DEStats
Number of function 1 evaluations: 13510
Number of function 2 evaluations: 0
Number of W matrix evaluations: 27
Number of linear solves: 64
Number of Jacobians created: 0
Number of nonlinear solver iterations: 64
Number of nonlinear solver convergence failures: 0
Number of fixed-point solver iterations: 0
Number of fixed-point solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 59
Number of rejected steps: 0
julia> solve(oprob, FBDF(linsolve=LinearSolve.KrylovJL())).stats
SciMLBase.DEStats
Number of function 1 evaluations: 6633
Number of function 2 evaluations: 0
Number of W matrix evaluations: 28
Number of linear solves: 69
Number of Jacobians created: 0
Number of nonlinear solver iterations: 69
Number of nonlinear solver convergence failures: 0
Number of fixed-point solver iterations: 0
Number of fixed-point solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 59
Number of rejected steps: 1
julia> solve(oprob, FBDF()).stats
SciMLBase.DEStats
Number of function 1 evaluations: 238
Number of function 2 evaluations: 0
Number of W matrix evaluations: 27
Number of linear solves: 120
Number of Jacobians created: 1
Number of nonlinear solver iterations: 120
Number of nonlinear solver convergence failures: 0
Number of fixed-point solver iterations: 0
Number of fixed-point solver convergence failures: 0
Number of rootfind condition calls: 0
Number of accepted steps: 61
Number of rejected steps: 1
We can bump the cutoff for switching to Krylov methods higher. Since the Krylov solver isn't preconditioned, that would help in some cases.
Also @TorkelE, could Catalyst start providing jacobian sparsity? It's a free 2x speedup.
julia> using SparseConnectivityTracer
julia> detector = TracerSparsityDetector()
TracerSparsityDetector()
julia> du0 = copy(u0);
julia> using ADTypes
julia> jac_sparsity = ADTypes.jacobian_sparsity(
           (du, u) -> oprob.f(du, u, oprob.p, 0.0), du0, u0, detector);
julia> ofun = ODEFunction(oprob.f.f; sparsity=float.(jac_sparsity));
julia> spoprob = ODEProblem(ofun, oprob.u0, oprob.tspan, oprob.p);
julia> @btime solve(spoprob, FBDF());
22.222 ms (2947 allocations: 16.89 MiB)
julia> @btime solve(spoprob, FBDF(linsolve=LinearSolve.KLUFactorization()));
7.483 ms (1780 allocations: 2.76 MiB)
It does seem like we're too small for the iterative solvers to help, even with preconditioning. The best I've gotten with them is 36ms (or ~250ms without preconditioners). Shall we bump the Krylov threshold to 2000?
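For reference, a preconditioned Krylov setup along the lines of the incomplete-LU pattern from the stiff ODE tutorials might look like the following. This is a sketch, not the exact configuration benchmarked above; the GMRES choice and the τ drop tolerance are untuned illustrative assumptions, and it reuses the spoprob with the sparsity pattern from the snippet above:

```julia
using OrdinaryDiffEqBDF, LinearSolve, IncompleteLU

# Rebuild the ILU preconditioner whenever a new W matrix is computed,
# otherwise reuse the previous factorization.
function incompletelu(W, du, u, p, t, newW, Plprev, Prprev, solverdata)
    if newW === nothing || newW
        Pl = ilu(convert(AbstractMatrix, W), τ = 50.0)  # τ is a guess, not tuned
    else
        Pl = Plprev
    end
    Pl, nothing  # left preconditioner only
end

# concrete_jac = true materializes W (sparse, via the sparsity pattern
# attached to spoprob) so the preconditioner has a matrix to factorize.
solve(spoprob, FBDF(linsolve = KrylovJL_GMRES(), precs = incompletelu,
    concrete_jac = true))
```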
yeah bump it up.
Also @TorkelE, could Catalyst start providing jacobian sparsity?
It should just always do that IMO
at least for anything sufficiently large.
Oh, part of what's happening here is really subtle and really sucks. FBDF is a pretty bad algorithm to use with auto-switching, because the cost of a failed switch to the nonstiff integrator is starting over from order 1. E.g. on the dense problem, we see that AutoTsit5(FBDF()) is ~2x slower than AutoTsit5(Rodas5P()) even though FBDF is ~4x faster than Rodas5P:
julia> @btime solve(oprob, AutoTsit5(Rodas5P()));
144.011 ms (3354 allocations: 7.24 MiB)
julia> @btime solve(oprob, AutoTsit5(FBDF()));
337.748 ms (2689 allocations: 8.48 MiB)
julia> @btime solve(oprob, Rodas5P());
172.873 ms (3717 allocations: 6.96 MiB)
julia> @btime solve(oprob, FBDF());
83.403 ms (1377 allocations: 7.25 MiB)
Oh interesting, we should make it so that if you switch but never actually step, you can keep the history alive.
Also @jClugstor, in the debug info we really should be logging when we try to switch algs and whether we're successful in doing so.
Unfortunately, it might not be quite this easy. By the time we fail the check a couple of times, we've trashed our dt value. To make this work, we would probably have to significantly rework the auto-switch logic to do a dry run of the nonstiff alg before switching.
I also wonder if we want to use Rodas5P for somewhat bigger sizes when a sparse jacobian is provided. Since nearly all of the cost of Rosenbrock methods is in the factorization, if we have a sparse matrix we can probably stay with Rosenbrock for longer.
Well, sparse matrices will be slow for something like a 100x100 LU factorization, so you don't want to do it if the system is too small. But yes, in that large-enough range, if you have a sparse matrix then direct methods will be relatively better than Krylov methods (which ignore sparsity), and that increases the cutoff point we should want.
What I'm saying is that since BDF methods trade lower order for fewer factorizations, if we have a sparse jacobian it might make sense to avoid going to FBDF until a larger size than we normally would. Also, as a temporary patch for the AutoSwitch/FBDF friction, how would you feel about bumping our threshold for FBDF methods from 50 to 100?
That's fine. I know some cases where that will be slower, so I think overall it's a wash.
Yeah. Part of the issue we're having here is that this problem is just on the edge of stiff: the eigenvalue is only ~1e3, so Tsit5 is actually able to solve this problem (although it takes 2s to do so).
Also @TorkelE, could Catalyst start providing jacobian sparsity?
I am all for a 2x speed-up. Isn't this something that MTK can/should determine automatically, though? I.e. Catalyst just generates a normal ODESystem and sends it to the ODEProblem, so unless there is something special about Catalyst systems, I'm not sure where we would use this option.
Or do you mean that it might make sense for these FSPSystems? I imagine they have a very specific sparse structure to use. That should be fairly straightforward to add to the ODEProblem(::FSPSystem, ...) call.
Oh right, MTK should do this if you do sparseprob = ODEProblem(sys, ...; jac = true, sparse = true).
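For a plain Catalyst-generated system (rather than the FSPSystem from the top of the thread), that would look roughly like the sketch below. The jac and sparse keywords are the MTK options mentioned above; the u0/tspan values here are placeholders, not from the original benchmark:

```julia
using Catalyst, ModelingToolkit, OrdinaryDiffEqBDF

rs = @reaction_network begin
    (p,d), 0 <--> X
    (kB,kD), 2X <--> X2
end
osys = convert(ODESystem, rs)  # Catalyst network -> MTK ODESystem

# jac = true generates an analytical jacobian; sparse = true makes it
# (and the jac_prototype) sparse so sparse factorizations get used.
u0 = [:X => 0.0, :X2 => 0.0]
ps = [:p => 1.0, :d => 0.2, :kB => 2.0, :kD => 5.0]
sparseprob = ODEProblem(osys, u0, (0.0, 50.0), ps; jac = true, sparse = true)
solve(sparseprob, FBDF())
```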
Only if the size is sufficiently large. Since a smaller sparse matrix would be slower.
And it needs to be "sparse enough"