SciMLSensitivity.jl icon indicating copy to clipboard operation
SciMLSensitivity.jl copied to clipboard

[WIP] Generalizing ForwardSensitivity

Open ArnoStrouwen opened this issue 2 years ago • 2 comments

using OrdinaryDiffEq
using DiffEqSensitivity
using ForwardDiff

function f(du, u, p, t)
    du[1] = dx = p[1] * u[1] - p[2] * u[1] * u[2]
    du[2] = dy = -p[3] * u[2] + u[1] * u[2]
end
u0 = [1.0,1.0]
p = [1.5,1.0,3.0]
du0 = [1.0 0.0; 0.0 2.0]
dp = [0.0 0.0;1.0 0.0; 0.0 0.0]
tspan = (0.0,10.0)

prob = ODEForwardSensitivityProblem(f,u0,tspan,p,ForwardSensitivity(;autojacvec=false,autojacmat=true);du0=du0,dp=dp);
sol = solve(prob,Tsit5(),reltol=1e-14)
println(reshape(sol(5.0)[3:end],2,2))

function g(θ)
    u0 = [θ[1],2*θ[2]]
    p = [1.5,θ[1],3.0]
    prob = ODEProblem(f,u0,tspan,p);
    sol = solve(prob,Tsit5(),reltol=1e-14)
    sol(5.0)
end
θ = [1.0, 0.5]
g(θ)

println(ForwardDiff.jacobian(g,θ))
[2.3628870872641823 -3.3084810220732983; 2.0379280636888324 -0.4906018446509108]
[2.362887087264832 -3.3084810220732117; 2.0379280636881245 -0.49060184465068685]

A prototype of a more general ForwardSensitivity transform. Includes taking sensitivities of initial conditions and subsets of parameters. Initial conditions and parameters can be jointly perturbed in some direction.

I need advise on how to merge this with existing functionality. Currently I do everything with jacobian matrix products. Jacobian vector products could similarly be implemented. Fully formed jacobians (user supplied or AD) will be inefficient with sensitivities of subsets of parameters, but might still be added such that all the old options are still available.

The key idea is that the user needs to supply du0 and dp. If dp is set to the identity matrix of dimension length(p) and du0 is set to the zero matrix of dimension length(u0),length(p) then the previous functionality is recovered, so this might be a sensible default option. The performance in that case should also be the same.

du0 and dp can be plugged into concrete_solve to make ForwardSensitivity work just as well as adjoint options?

ArnoStrouwen avatar Jan 01 '22 02:01 ArnoStrouwen

There's a lot to mention here. First of all, as https://aip.scitation.org/doi/10.1063/5.0060697 describes, chunking can be beneficial to performance when cost scales more than linearly, which is the case for stiff ODE solvers. So a form that is able to compute on chunks at a time would be beneficial for improving the performance, and making use of that from the solve adjoint dispatches would be a good option to add (this has already been done for the ForwardDiff version).

Currently I do everything with jacobian matrix products. Jacobian vector products could similarly be implemented. Fully formed jacobians (user supplied or AD) will be inefficient with sensitivities of subsets of parameters, but might still be added such that all the old options are still available.

The previous dispatches already handles all of the cases there I thought?

du0 and dp can be plugged into concrete_solve to make ForwardSensitivity work just as well as adjoint options?

yeah, it should just slot in right here: https://github.com/SciML/DiffEqSensitivity.jl/blob/master/src/concrete_solve.jl#L308-L310. I'm not sure what dp is for though.

As for subsets of parameters, https://github.com/SciML/ModelingToolkit.jl/issues/1088 is where that's going.

ChrisRackauckas avatar Jan 02 '22 13:01 ChrisRackauckas

chunking can be beneficial to performance when cost scales more than linearly, which is the case for stiff ODE solvers.

I can add chunking. How it is currently implemented in ForwardSensitivity(), you can chose between either one at at a time (jacvec) or all together (jacmat). I can do the same for this PR, but anything between those two extremes is more work. Also, stiff solvers are not working well with making the jacobian of the extended system, you have to use finite differences currently. This is because these functions don't play nice with AD: https://github.com/SciML/DiffEqSensitivity.jl/blob/b5be7279e4bf831778ccc81cc6614b5ffd777e0b/src/forward_sensitivity.jl#L88

The previous dispatches already handles all of the cases there I thought?

It does, but it relies heavily on the assumption that the number of sensitivities you calculate is equal to the number of parameters, to set up correct dimensions of arrays. I generalized only one branch of the dispatch to see if it works, but I will do the rest.

I'm not sure what dp is for though.

You need something like that to be able to handle perturbing θ where u0 = [θ[1],2*θ[2]]; p = [1.5,θ[1],3.0] You can also use it to define subsets of parameters using zero rows. But this is somewhat inefficient maybe because you then end up with a bunch of dual numbers with zeros as partials.

As for subsets of parameters, SciML/ModelingToolkit.jl#1088 is where that's going.

Great, that was the use case I am building this for anyway, more specifically to put it in DiffEqUncertainty for robust control.

ArnoStrouwen avatar Jan 02 '22 22:01 ArnoStrouwen