
Add new backends with DifferentiationInterface.jl

Open • amontoison opened this issue on Sep 11 '24 • 12 comments

Add the following backends:

  • Enzyme
  • Zygote
  • Mooncake
  • Diffractor
  • Tracker
  • Symbolics
  • ChainRules
  • FastDifferentiation
  • FiniteDiff
  • FiniteDifferences
  • PolyesterForwardDiff

amontoison commented on Sep 11 '24

Package name (latest stable badges omitted):
  • CaNNOLeS.jl
  • DCISolver.jl
  • DerivativeFreeSolvers.jl
  • JSOSolvers.jl
  • NLPModelsIpopt.jl
  • OptimalControl.jl
  • OptimizationProblems.jl
  • Percival.jl
  • QuadraticModels.jl
  • SolverBenchmark.jl
  • SolverTools.jl

github-actions[bot] commented on Sep 11 '24

> It's only one, so basically with this change we would no longer be able to use the Hessian with Enzyme and Zygote.

We can, but only for unconstrained problems. I wanted to remove what was not working before.

The user will no longer be able to use an incorrect Hessian, which is better for everyone.
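
For unconstrained problems, the objective Hessian remains available by composing a forward backend over a reverse backend; a minimal sketch (the backend pairing is illustrative, not the package's default):

using DifferentiationInterface
import ForwardDiff, Zygote

f(x) = sum(abs2, x)

# Forward-over-reverse: ForwardDiff differentiates the gradient computed by Zygote.
backend = SecondOrder(AutoForwardDiff(), AutoZygote())
H = hessian(f, backend, rand(3))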

amontoison commented on Sep 13 '24

@gdalle May I ask you to check what I did wrong in the file di.jl? I get different errors on Buildkite: https://buildkite.com/julialang/adnlpmodels-dot-jl/builds/243

amontoison commented on Sep 26 '24

It looks like the problem comes from forgetting to import the function grad? Not a DI thing, presumably an NLPModels thing.
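
If that is indeed the cause, the fix is a one-line import; a sketch of the pitfall (the method definition is commented out and purely illustrative):

using NLPModels
import NLPModels: grad  # required before adding methods to the existing function

# Without the import above, a definition such as
#     grad(nlp, x) = ...
# creates a brand-new function in the current module instead of extending
# NLPModels.grad, so code calling NLPModels.grad never dispatches to it.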

gdalle commented on Sep 26 '24

@dpo could you perhaps give me access to the repo so that I may help with this and future PRs?

gdalle commented on Sep 26 '24

@gdalle I invited you. Thank you for your work here!!!

dpo commented on Sep 26 '24

@amontoison what do you think about moving away from symbols here?

gdalle commented on Oct 01 '24

> @amontoison what do you think about moving away from symbols here?

It depends on the alternatives. Right now, it's useful to be able to specify that we want optimized backends with :optimized, or only matrix-free backends (no Jacobian or Hessian) with :matrix_free. But if Enzyme.jl is stable enough, we could drop :optimized and use a boolean for matrix-free backends.

It will be easier to provide an AutoBackend() with the appropriate options.
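
In code terms (keyword names sketched from the description above, not checked against the released API):

using ADNLPModels

f(x) = sum(abs2, x)
x0 = rand(3)

nlp = ADNLPModel(f, x0, backend = :optimized)      # optimized backend per operator
# nlp = ADNLPModel(f, x0, backend = :matrix_free)  # products only, no Jacobian/Hessian

# Possible simplification once Enzyme.jl is stable enough: drop :optimized and
# keep a single boolean toggle (hypothetical keyword):
# nlp = ADNLPModel(f, x0, matrix_free = true)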

amontoison commented on Oct 01 '24

If I'm not mistaken, there are two levels here:

  • the interface you present to the user (:optimized, :matrix_free)
  • the way you represent the backends internally

Right now you base all of the internal representations on Symbols. But as explained here, the whole reason for ADTypes was to move beyond Symbols towards full-fledged types that are 1) more expressive and 2) dispatchable. That's why I was suggesting a similar move here. It doesn't stop you from offering :optimized autodiff options in the front end if you like.
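
Concretely (a generic sketch, not ADNLPModels code), a backend object carries its configuration and supports dispatch, which a Symbol cannot:

using ADTypes

# More expressive: options live inside the backend object...
b = AutoReverseDiff(compile = true)

# ...and dispatchable: behavior can be specialized per backend type.
is_forward(::ADTypes.AbstractADType) = false
is_forward(::AutoForwardDiff) = true

is_forward(b)                  # false
is_forward(AutoForwardDiff())  # true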

gdalle commented on Oct 01 '24

Do you have an example of what you suggest?

amontoison commented on Oct 01 '24

I could try to show you in an alternative PR.

gdalle commented on Oct 01 '24

Okay, it is a bit hard to submit a PR, since there would be a lot of things to rewrite and I don't understand what each part does. But essentially I was imagining something like this:

using ADTypes
using DifferentiationInterface
using LinearAlgebra
using SparseMatrixColorings
using SparseConnectivityTracer
import ForwardDiff, ReverseDiff

# Wrap a dense backend for sparse Jacobians/Hessians: sparsity detection by
# SparseConnectivityTracer, coloring by SparseMatrixColorings.
function DefaultAutoSparse(backend::AbstractADType)
    return AutoSparse(
        backend;
        sparsity_detector=TracerSparsityDetector(),
        coloring_algorithm=GreedyColoringAlgorithm(),
    )
end

# One AD backend per NLP operator.
struct ADModelBackend
    gradient_backend
    hprod_backend
    jprod_backend
    jtprod_backend
    jacobian_backend
    hessian_backend
end

# DI preparation objects (tapes, caches, sparsity patterns), one per operator.
struct ADModelBackendPrep
    gradient_prep
    hprod_prep
    jprod_prep
    jtprod_prep
    jacobian_prep
    hessian_prep
end

function ADModelBackend(forward_backend::AbstractADType, reverse_backend::AbstractADType)
    @assert ADTypes.mode(forward_backend) isa
        Union{ADTypes.ForwardMode,ADTypes.ForwardOrReverseMode}
    @assert ADTypes.mode(reverse_backend) isa
        Union{ADTypes.ReverseMode,ADTypes.ForwardOrReverseMode}

    gradient_backend = reverse_backend  # gradients via pullback
    hprod_backend = SecondOrder(forward_backend, reverse_backend)  # forward-over-reverse HVP
    jprod_backend = forward_backend  # Jv via pushforward
    jtprod_backend = reverse_backend  # Jᵀv via pullback
    jacobian_backend = DefaultAutoSparse(forward_backend)  # or a size-dependent heuristic
    hessian_backend = DefaultAutoSparse(SecondOrder(forward_backend, reverse_backend))

    return ADModelBackend(
        gradient_backend,
        hprod_backend,
        jprod_backend,
        jtprod_backend,
        jacobian_backend,
        hessian_backend,
    )
end

function ADModelBackendPrep(
    admodel_backend::ADModelBackend,
    obj::Function,
    cons::Function,
    lag::Function,
    x::AbstractVector,
)
    (;
        gradient_backend,
        hprod_backend,
        jprod_backend,
        jtprod_backend,
        jacobian_backend,
        hessian_backend,
    ) = admodel_backend

    # Dummy values, used only to size and prepare the differentiation caches.
    c = cons(x)
    λ = similar(c)

    dx = similar(x)
    dc = similar(c)
    gradient_prep = prepare_gradient(lag, gradient_backend, x, Constant(λ))
    hprod_prep = prepare_hvp(lag, hprod_backend, x, (dx,), Constant(λ))
    jprod_prep = prepare_pushforward(cons, jprod_backend, x, (dx,))
    jtprod_prep = prepare_pullback(cons, jtprod_backend, x, (dc,))
    jacobian_prep = prepare_jacobian(cons, jacobian_backend, x)
    hessian_prep = prepare_hessian(lag, hessian_backend, x, Constant(λ))

    return ADModelBackendPrep(
        gradient_prep, hprod_prep, jprod_prep, jtprod_prep, jacobian_prep, hessian_prep
    )
end

admodel_backend = ADModelBackend(AutoForwardDiff(), AutoReverseDiff())

# Toy objective, constraints, and Lagrangian.
obj(x) = sum(x)
cons(x) = abs.(x)
lag(x, λ) = obj(x) + dot(λ, cons(x))

admodel_backend_prep = ADModelBackendPrep(admodel_backend, obj, cons, lag, rand(3));
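
The prepared objects would then be reused on every subsequent evaluation, along these lines (operator signatures as documented by DifferentiationInterface):

x = rand(3)
λ = rand(3)

g = gradient(lag, admodel_backend_prep.gradient_prep,
             admodel_backend.gradient_backend, x, Constant(λ))
J = jacobian(cons, admodel_backend_prep.jacobian_prep,
             admodel_backend.jacobian_backend, x)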

gdalle commented on Oct 01 '24