SciMLSensitivity.jl

Analytic Discrete Adjoint

Open taylormcd opened this issue 3 years ago • 4 comments

How much work would be involved in implementing the discrete adjoint analytically (rather than through AD)? I did a derivation, and it seems like a general implementation should be possible with the support of AD. What are your thoughts?

Here's my derivation for reference, let me know if there's something incorrect or missing.

discrete.md

taylormcd avatar Feb 07 '22 21:02 taylormcd

So you want to use the integrator form and define the adjoint via a discrete dynamic process of step!(integrator) calls, and then vector-Jacobian products of step!? I think that can work. Though I would think that differentiating the solver might be nearly identical in performance at that point, since it's the internals of step! that have all of the meat.
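For concreteness, here is a minimal sketch of that idea, assuming a fixed-step explicit Euler method in place of a real `step!(integrator)` call and hand-written vector-Jacobian products in place of AD. The toy right-hand side and all of the names (`f`, `vjp_u`, `vjp_p`, `euler_step`, `forward`, `discrete_adjoint`) are hypothetical and are not part of SciMLSensitivity.jl or the linked derivation.

```julia
# Toy right-hand side f(u, p) = -p .* u (elementwise decay); an assumption for illustration.
f(u, p) = -p .* u

# Hand-written vector-Jacobian products of f: (∂f/∂u)' * λ and (∂f/∂p)' * λ.
# In a general implementation these are the pieces AD would supply.
vjp_u(u, p, λ) = -p .* λ
vjp_p(u, p, λ) = -u .* λ

# One explicit Euler step; stands in for a solver step.
euler_step(u, p, h) = u .+ h .* f(u, p)

# Scalar loss on the final state and its gradient.
loss(u) = sum(abs2, u) / 2
dloss(u) = u

# Forward pass: store every state so the reverse pass can reuse it.
function forward(u0, p, h, nsteps)
    us = Vector{typeof(u0)}(undef, nsteps + 1)
    us[1] = u0
    for k in 1:nsteps
        us[k + 1] = euler_step(us[k], p, h)
    end
    return us
end

# Reverse pass: propagate λ backwards through the steps, accumulating dL/dp.
function discrete_adjoint(u0, p, h, nsteps)
    us = forward(u0, p, h, nsteps)
    λ = dloss(us[end])                        # λ_N = ∂L/∂u_N
    dp = zero(p)
    for k in nsteps:-1:1
        dp = dp .+ h .* vjp_p(us[k], p, λ)    # uses λ_{k+1}
        λ = λ .+ h .* vjp_u(us[k], p, λ)      # λ_k = (I + h ∂f/∂u)' λ_{k+1}
    end
    return loss(us[end]), dp, λ               # final λ is dL/du0
end

u0 = [1.0, 2.0]
p = [0.5, 1.5]
L, dp, du0 = discrete_adjoint(u0, p, 0.01, 100)
```

Because the reverse loop differentiates the discrete update itself, `dp` and `du0` are exact gradients of the discretized loss (up to floating-point error), not approximations of the continuous adjoint. Storing every state in `forward` is the simplest choice; a real implementation would likely need checkpointing for long integrations.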

Analytic discrete adjoints have been done before (https://epubs.siam.org/doi/10.1137/130912335?mobileUi=0), though I think what really needs to be shown is that writing it out gives a real advantage over AD of the solver, since technically in the end it's exactly the same code. I'm a bit skeptical it's worth the effort, but maybe with some good hand optimizations it will do better than Zygote/Diffractor. At least the compile time should be less.

Anyway, it's worth a try, but I haven't tried it myself because I'm not the most bullish on the benefit being worth the cost.

ChrisRackauckas avatar Feb 07 '22 22:02 ChrisRackauckas

Theoretically, AD would generate the same code with the right overloads, but tape-based methods don't work for long computations, and my understanding is that pervasive Zygote doesn't quite work yet. So the best alternatives I can think of are continuous sensitivity analysis or the proposed approach. Admittedly, this is not new, but I think that with the current state of AD, a hybrid approach would be faster than pure AD.

taylormcd avatar Feb 07 '22 23:02 taylormcd

Maybe this could be combined with ModelingToolkit (MTK)? (In the sense that MTK could potentially generate more optimized vjp code for a certain subset of problems than the AD packages.)

frankschae avatar Feb 08 '22 08:02 frankschae

I don't see MTK having a good use here. At least with Enzyme around, Enzyme can be used for the vjps about as effectively.

Zygote just needs to support push! to handle the ODE solvers. @DhairyaLGandhi did that ever get tested?

Maybe there can be overhead reduction.

ChrisRackauckas avatar Feb 09 '22 03:02 ChrisRackauckas