Sundials.jl icon indicating copy to clipboard operation
Sundials.jl copied to clipboard

Support for AD and Jacobian sparsity

Open YingboMa opened this issue 4 years ago • 11 comments

YingboMa avatar Jun 30 '21 17:06 YingboMa

Is it still a priority now that we have QNDF? Someone (anyone) can do it, but I don't think it's as high of a priority now.

ChrisRackauckas avatar Jun 30 '21 17:06 ChrisRackauckas

Yeah, probably won't be a priority once we have FBDF (FLC BDF). Currently, CVODE_BDF still wins in a few HVAC systems.

YingboMa avatar Jun 30 '21 17:06 YingboMa

This is something that we are interested in benchmarking the power systems stuff. We could take care of this. cc. @rodrigomha

jd-lara avatar Jun 30 '21 19:06 jd-lara

All of the tools for doing it are in https://github.com/SciML/OrdinaryDiffEq.jl/blob/v5.59.2/src/derivative_wrappers.jl, where you make it able to use AD or not to define Jacobians and then in Sundials you'd just register it in the way it's already doing a Jacobian overload if a .jac exists:

https://github.com/SciML/Sundials.jl/blob/v4.5.1/src/common_interface/solve.jl#L319-L338

So then, now we would always use that path, but if .jac exists we'd first build a Jacobian function from that AD/FiniteDiff.jl stuff and then register that.

ChrisRackauckas avatar Jun 30 '21 22:06 ChrisRackauckas

For my problem (reaction transport) CVOD_BDF with banded solver is still way faster than QNDF+Bandedmatrices. CVODE takes a few minutes while I don't have the patience to wait QNDF to finish. I have a feeling that QNDF doesn't work well with discontinuity. In my code I have many things like "if oversaturate then precipitate minerals". If I disable such reactions then QNDF is faster than CVODE_BDF. Otherwise it is very slow. I have now used smoothed Heaviside function to replace these discontinuities, but still QNDF doesn't work well. I also wonder if the interval banded solver in Sundials is just better than Bandedmatrices. I use MKL.

JianghuiDu avatar Sep 29 '21 11:09 JianghuiDu

Would BandedMatrices use MKL? Im not sure it does. if it is still on OpenBlas, then just a flat loop (which Sundials does) is open a ton faster than OpenBLAS. I really think the answer is to get Loop vectorization stuff into BM.jl

ChrisRackauckas avatar Oct 01 '21 12:10 ChrisRackauckas

It lowers to LAPACK/BLAS calls so use MKL if that's what Julia is compiled with

dlfivefifty avatar Oct 01 '21 22:10 dlfivefifty

Try doing @profile... a comment "this is way faster than that" is not particularly helpful... often there's some obscure type inference issue underlying things.

Banded solvers themselves are pretty simple so all close to optimal: you might get a max 2-4x difference in speed by changing OpenBLAS for MKL, or LU for QR, but not enough to go from "a few minutes" to "too long to wait".

dlfivefifty avatar Oct 01 '21 22:10 dlfivefifty

Yes, share an example. Last major issue with Sundials turned out to be the linear solvers, which when we moved away from OpenBLAS to pure Julia tools we went from like 5x slower to 2x faster. That's why a priori I would think that, well the only difference here is the Jacobian type so something similar could be happening here. But without an MWE and a profile it's just guessing. It could also be that something transforms into dense somewhere.

ChrisRackauckas avatar Oct 02 '21 14:10 ChrisRackauckas

As far as I know CVODE also uses Lapack banded solver so this shouldn't be different from BandedMatrices. Maybe the problem is QNDF? My model is here. 2800 equations, upper bandwth = 55 and lower bandwth =33. https://github.com/JianghuiDu/tests/blob/96d9d8da7ffe6f0dfcf9c7020f3c73eb4f3d20f3/model.ipynb Comparing CVODE_BDF banded with QNDF+ BandedMatrices (finite difference).

tspan = (0.0, 15000.0);
prob = ODEProblem(reactran_fvcf, C_uni, tspan, nothing);
cb = TerminateSteadyState(1e-8, 1e-6, DiffEqCallbacks.allDerivPass);

@time sol = DifferentialEquations.solve(
    prob,
    CVODE_BDF(linear_solver = :LapackBand, jac_upper = Upbdwth, jac_lower = Lwbdwth),
    reltol = 1e-6,
    abstol = 1e-15,
    save_everystep = false,
    callback = cb,
    saveat = 50.0,
    tstops = 5.0,
    # dtmax = 0.01,
    maxiters = Int(1e6),
)

with

using BandedMatrices
jac_prototype = BandedMatrix(Ones(Ngrid*nspec,Ngrid*nspec),(Lwbdwth,Upbdwth));
reactran_band = ODEFunction(reactran_fvcf,jac_prototype=jac_prototype);

tspan = (0.0, 15000.0);
prob_band = ODEProblem(reactran_band, C_uni, tspan, nothing);
cb = TerminateSteadyState(1e-8, 1e-6, DiffEqCallbacks.allDerivPass);

@time sol = DifferentialEquations.solve(
    prob_band,
    QNDF(autodiff=false),
#     QNDF(autodiff=true,chunk_size=chunk_size), using AD
    reltol = 1e-6,
    abstol = 1e-15,
    save_everystep = false,
    callback = cb,
    saveat = 50.0,
    tstops = 5.0,
    # dtmax = 0.01,
    maxiters = Int(1e6),
)

On my laptop CVODE takes about 2 minutes while QNDF is too slow to finish. Julia version 1.6.3 + MKL.

JianghuiDu avatar Oct 03 '21 11:10 JianghuiDu

Thanks, isolated to a QNDF issue. https://github.com/SciML/OrdinaryDiffEq.jl/issues/1496

ChrisRackauckas avatar Oct 03 '21 12:10 ChrisRackauckas