Optimization.jl icon indicating copy to clipboard operation
Optimization.jl copied to clipboard

Choice of second-order AD and warnings

Open gdalle opened this issue 1 year ago • 3 comments

This issue is about the machinery for choosing backends and throwing warnings:

https://github.com/SciML/OptimizationBase.jl/blob/2ffab7e93197c1fc8d9ed6a39857e301a71a474e/src/adtypes.jl#L222-L236

https://github.com/SciML/OptimizationBase.jl/blob/2ffab7e93197c1fc8d9ed6a39857e301a71a474e/src/cache.jl#L45-L58

I think that this could be both optimized and simplified due to recent changes in DI.

Nowadays, DI.inner and DI.outer can also be called on backends which are not SecondOrder, they just act as the identity. Thus, you don't need to explicitly create a SecondOrder(adtype, adtype). Passing adtype alone will be equivalent in most cases, and faster in some because it can leverage custom Hessian implementations within a single backend (e.g. SecondOrder(AutoForwardDiff(), AutoForwardDiff()) cannot call ForwardDiff.hessian whereas AutoForwardDiff() can). Furthermore, DI's hvp and hessian for AutoZygote() already use ForwardDiff-over-Zygote.

Here are my suggestions:

  • Simplify the generate_adtype logic and its variants to avoid creating SecondOrder objects altogether.
  • Throw a warning based on the modes DI.inner and DI.outer, e.g. when the inner backend is not a reverse mode backend. This can be checked with ADTypes.mode(DI.inner(adtype)) isa Union{ADTypes.ReverseMode,ADTypes.ForwardOrReverseMode}. Of course you also want to allow ForwardDiff so feel free to refine.
  • Document this behavior so that users are less confused by the warnings (see this Discourse thread).

What do you think @Vaibhavdixit02?

gdalle avatar Nov 12 '24 10:11 gdalle

This sounds very reasonable, sorry for the long delay in response. I will fix this up.

Vaibhavdixit02 avatar Jan 16 '25 13:01 Vaibhavdixit02

Thank you for this awesome package. Just wanted to say that I still get the following warning when using IPNewton method and AutoForwardDiff.

Warning: The selected optimization algorithm requires second order derivatives, but `SecondOrder` ADtype was not provided.
So a `SecondOrder` with ADTypes.AutoForwardDiff() for both inner and outer will be created, this can be suboptimal and not work in some cases so an explicit `SecondOrder` ADtype is recommended.

curioustolearn avatar May 09 '25 12:05 curioustolearn

Bump, this is still relevant (throws a warning with ForwardDiff) as of Optimization v4.7.0.

ForceBru avatar Sep 15 '25 19:09 ForceBru