NLopt.jl
Easier auto-differentiation support?
Right now you need to manually set up auto-differentiation for functions, which is more cumbersome than other packages (e.g. Optim, or JuMP directly). Would you be willing to entertain a PR to make this a little easier to work with?
You can see https://github.com/ubcecon/computing_and_datascience/blob/master/julia_tutorials/nlopt/nlopt-tutorial.ipynb for an example of what it might look like. Basically, you could just call an adapter on functions and pass that object directly into your existing functions.
The downside is that this would require adding https://github.com/JuliaNLSolvers/NLSolversBase.jl as a dependency with the following sub-dependencies: https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/REQUIRE
Another idea to throw out there. We have an abstraction for NLP solvers in MathOptInterface with multiple backends, e.g., Ipopt and Knitro. (NLopt has a wrapper for the old MathProgBase API but not yet MathOptInterface.) If you target your adapters to MathOptInterface instead of NLopt directly then you get multiple backends without much extra cost.
@mlubin I really hope that NLopt catches up with MOI soon so that I can swap out Ipopt when using JuMP.
Maybe I am misunderstanding, but if you are suggesting that, given MOI, we would instead use it directly for calling the optimizer, I am not sure I see it as it currently stands. The complexity of MOI is necessary to drive an EDSL like JuMP, but it is very heavy for those of us who just want to call a nonlinear solver and pass in arbitrary functions of vectors. Perhaps JuMP will become a little more friendly for those who want nonlinear solvers without a DSL, or perhaps there is another interface built on MOI for people like me?
Just so you know, the total code required to add Optim-style AD support to NLopt is just:
using NLSolversBase
struct NLoptAdapter{T <: AbstractObjective} <: Function
    nlsolver_base::T
end
# implement fg!; note that the order is reversed
(adapter::NLoptAdapter)(x, df) = adapter.nlsolver_base.fdf(df, x)
(adapter::NLoptAdapter)(result, x, jacobian_transpose) = adapter.nlsolver_base.fdf(result, jacobian_transpose', x)
# constructors
NLoptAdapter(f, x, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f, x, autodiff = autodiff))
NLoptAdapter(f!, x::Vector, F::Vector, autodiff = :forward) = NLoptAdapter(OnceDifferentiable(f!, x, F, autodiff = autodiff))
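For context, here is a hypothetical usage sketch of that adapter (the objective and starting point are made up; the adapter definitions are repeated so the snippet runs standalone):

```julia
using NLopt, NLSolversBase

# Adapter from the sketch above, repeated so this snippet is standalone.
struct NLoptAdapter{T <: AbstractObjective} <: Function
    nlsolver_base::T
end
(adapter::NLoptAdapter)(x, df) = adapter.nlsolver_base.fdf(df, x)
NLoptAdapter(f, x, autodiff = :forward) =
    NLoptAdapter(OnceDifferentiable(f, x, autodiff = autodiff))

f(x) = sum(abs2, x .- 1)            # minimum at x = [1, 1]
obj = NLoptAdapter(f, zeros(2))      # forward-mode AD by default

opt = Opt(:LD_LBFGS, 2)
xtol_rel!(opt, 1e-8)                 # ensure a stopping criterion is set
min_objective!(opt, obj)
minf, minx, ret = optimize(opt, zeros(2))
```

The adapter is callable because it subtypes `Function`, so NLopt invokes it with `(x, grad)` and the gradient buffer is filled via `fdf`.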
Something along these lines seems reasonable to me.
Great @arnavs will prepare a full PR for this with the necessary tests, documentation, etc.
At this point, we think that this will require no direct changes to the NLopt code itself (though you may have ideas of how it could be reorganized). By using NLSolversBase as a dependency, we may be able to hook in other AD implementations if they are added to that library (@arnavs may also try to add Flux.jl or Zygote.jl support to NLSolversBase in a separate issue).
(1) We don't add an exported wrapper type and instead add arguments to the functions themselves, i.e. add a method shadowing the existing ones so it would look like:
# A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M)
f(x) = sum((x .- m).^2) + sum(log.(x))
# define the optimization problem
opt = Opt(:LD_LBFGS, M)
min_objective!(opt, f, autodiff=:forward)
The list of additional methods would be something like
min_objective!(opt::Opt, f::Function; autodiff)
max_objective!(opt::Opt, f::Function; autodiff)
inequality_constraint!(opt::Opt, fc, tol = 0; autodiff)
equality_constraint!(opt::Opt, h, tol = 0; autodiff)
inequality_constraint!(opt::Opt, c, tol::AbstractVector; autodiff)
equality_constraint!(opt::Opt, c, tol::AbstractVector; autodiff)
where autodiff would support the NLSolversBase symbols in https://github.com/JuliaNLSolvers/NLSolversBase.jl/blob/master/src/NLSolversBase.jl#L55
(i.e. :forward, :finite or :finiteforward, I think?).
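A minimal sketch of how such a shadowing method could work. The helper name `min_objective_ad!` is made up, and the ForwardDiff-based closure is an assumption for illustration (the real proposal would delegate to NLSolversBase); only `:forward` is handled:

```julia
using NLopt, ForwardDiff

# Hypothetical helper: wrap f in the (x, grad) form NLopt expects,
# then delegate to the existing min_objective!.
function min_objective_ad!(opt::Opt, f::Function; autodiff::Symbol = :forward)
    autodiff == :forward || error("only :forward is sketched here")
    function obj(x::Vector, grad::Vector)
        if length(grad) > 0
            ForwardDiff.gradient!(grad, f, x)  # fill NLopt's gradient buffer
        end
        return f(x)
    end
    return min_objective!(opt, obj)
end

# Usage, mirroring the proposal above (smaller M for a quick check):
M = 5
m = range(2, 5, length = M)
f(x) = sum((x .- m) .^ 2) + sum(log.(x))
opt = Opt(:LD_LBFGS, M)
lower_bounds!(opt, fill(1e-8, M))   # keep log(x) defined
xtol_rel!(opt, 1e-8)
min_objective_ad!(opt, f, autodiff = :forward)
minf, minx, ret = optimize(opt, fill(1.0, M))
```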
(2) Alternatively, the less invasive option is to have a publicly exposed wrapper:
# A setup from our existing tests
M = 5000
x0 = fill(1.0, M)
m = range(2, 5, length = M)
f(x) = sum((x .- m).^2) + sum(log.(x))
# define the optimization problem
opt = Opt(:LD_LBFGS, M)
f_opt = NLoptDifferentiable(f, x0) # The only new code!
min_objective!(opt, f_opt)
For the name of the adapter:
- Optim.jl has some algorithms which use Hessians, and hence distinguishing between OnceDifferentiable and TwiceDifferentiable is necessary. It didn't appear that the Julia interface supported any algorithms with Hessians in the objective. Did I miss anything?
- If not, then NLoptDifferentiable or even Differentiable would work (we don't want to collide with NLSolversBase.OnceDifferentiable).
@stevengj If you have no strong opinions, I think we should first try to add it directly in the interfaces on those functions instead of wrappers? That would be the easiest for end-users.
@arnavs: A few notes:
- If we try to make new methods for min_objective!, etc., we will need to deal with the fact that we cannot construct a NLSolversBase.OnceDifferentiable without having a vector of the right type to pass in. OnceDifferentiable uses it to create a cache, so it is the type that matters rather than the value within the vector.
- For this, I think you can query the opt::Opt for the dimension of the objective that was proposed and create a zeros(Float64, ...) to pass in. The tol::AbstractVector for the vector inequality and equality constraints should be useful for determining that Jacobian, and the univariate inequality and equality constraints should not be too difficult to figure out.
- The code for creating the min_objective! and max_objective! is here: https://github.com/JuliaOpt/NLopt.jl/blob/master/src/NLopt.jl#L426, which is basically a proxy to the ccall in the underlying library. You could create the min_objective! etc. which construct and call these outside of this sort of fancy @eval, and then just call these existing methods.
- For the unit tests, it is essential to try out all of the variations of finite differences and forward differences for the univariate objective, multivariate objective, univariate equality constraint, univariate inequality constraint, multivariate equality, and multivariate inequality. You can construct artificially simple optimization problems for those tests.
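A sketch of the dimension-query idea from the first two notes. The helper name is made up; `ndims(opt)` is the NLopt.jl way to get the problem dimension from an `Opt`:

```julia
using NLopt, NLSolversBase

# Hypothetical helper: size the OnceDifferentiable cache from the Opt.
# Only the element type and length of the seed vector matter, not its values.
function make_objective_cache(opt::Opt, f; autodiff = :forward)
    x_seed = zeros(Float64, ndims(opt))
    return OnceDifferentiable(f, x_seed, autodiff = autodiff)
end

opt = Opt(:LD_LBFGS, 3)
od = make_objective_cache(opt, x -> sum(abs2, x))
```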
This thread is moving in the direction of adding direct support for AD to NLopt which is fine with me. But anyway to address the point on MOI,
Maybe I am misunderstanding, but if you are suggesting that, given MOI, we would instead use it directly for calling the optimizer, I am not sure I see it as it currently stands. The complexity of MOI is necessary to drive an EDSL like JuMP, but it is very heavy for those of us who just want to call a nonlinear solver and pass in arbitrary functions of vectors.
Yes, I was suggesting to call the optimizers via MOI. Yes, there are a few more lines of code required, but if you're already in the business of designing APIs for nonlinear optimization (as discussed in this thread), then I don't expect it to be a significant challenge. There are a number of people familiar with the MOI API who can answer questions at https://gitter.im/JuliaOpt/JuMP-dev. I also have an old demo notebook of connecting Newton's method to MathProgBase: https://nbviewer.jupyter.org/github/JuliaOpt/juliaopt-notebooks/blob/master/notebooks/MathProgBase%20Newton%20solver.ipynb. The API for interacting with derivatives is mostly unchanged from MPB to MOI.
The main extra things that MOI supports over the Optim/NLopt-type interfaces are:
- Sparse hessian matrices
- Separate handling of linear, quadratic, and other classes of constraints.
I understand that a lot of use cases for nonlinear optimization don't need to think about sparse hessian matrices or linear constraints. Ultimately it's up to you to decide if the extra complexity of the abstraction is worth it.
Perhaps JuMP will become a little more friendly for those who want nonlinear solvers without a DSL, or perhaps there is another interface built on MOI for people like me?
The JuliaSmoothOptimizers organization has built a good amount of infrastructure that could be useful. As far as I know it hasn't been updated for MOI yet though. In any case, the best way for a simpler NLP interface to be built on top of MOI is for someone invested in the result to write the code. It's a question of a couple days, not weeks or months.
but if you're already in the business of designing APIs for nonlinear optimization (as discussed in this thread)
That is close to the last thing that I want to do. My only goal here is to hack AD support into NLopt to get through until there are interfaces that people like me want to use. To me, the gold standard is https://tomopt.com/tomlab/ : pass in a function for the constraints, objectives, matrices (or matrix-free operators at some point) for the linear constraints, and settings... essentially DSL-free.
We can discuss further in slack to see if a NLP interface based on MOI is possible and if there are people who could help design it. I am not sure I have access to people with sufficient expertise (BTW, you had previously mentioned JuliaSmoothOptimizers to me, but I never was able to navigate it to see the interface they were proposing).
Rather than min_objective!(opt, f, autodiff=:forward), I would suggest something like min_objective!(opt, autodiff(f)), where autodiff(f) takes a function f(x) and returns an (x, grad) -> ... function of the form expected by nlopt.
Then you are just adding a new autodiff function and don't need to modify the existing functions at all. In principle, the autodiff function (or whatever it is called) could even be in an a separate package, e.g. in NLSolversBase.
Something along these lines? (I use this to add forward differentiation support in my code.)
function nlopt_form(f, x, g, diffresult_cache)
    if length(g) > 0
        ForwardDiff.gradient!(diffresult_cache, f, x)
        g .= DiffResults.gradient(diffresult_cache)
        return DiffResults.value(diffresult_cache)
    else
        return f(x)
    end
end
using that as a base,
function autodiff(f)
    res = (x, grad = []) -> begin
        if length(grad) > 0
            diffresult_cache = DiffResults.GradientResult(x)
            ForwardDiff.gradient!(diffresult_cache, f, x)
            grad .= DiffResults.gradient(diffresult_cache)
            return DiffResults.value(diffresult_cache)
        else
            return f(x)
        end
    end
    return res
end
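For what it's worth, wiring that wrapper into NLopt might look like this (the wrapper is repeated so the snippet runs standalone; the test objective is made up):

```julia
using NLopt, ForwardDiff, DiffResults

# Wrapper from above, repeated so this snippet is standalone.
function autodiff(f)
    (x, grad = []) -> begin
        if length(grad) > 0
            diffresult_cache = DiffResults.GradientResult(x)
            ForwardDiff.gradient!(diffresult_cache, f, x)
            grad .= DiffResults.gradient(diffresult_cache)
            return DiffResults.value(diffresult_cache)
        else
            return f(x)
        end
    end
end

opt = Opt(:LD_LBFGS, 2)
xtol_rel!(opt, 1e-8)
min_objective!(opt, autodiff(x -> sum(abs2, x .- 2)))  # minimum at [2, 2]
minf, minx, ret = optimize(opt, zeros(2))
```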
The problem with adding this to NLSolversBase is that the check for the existence of the gradient is different: here it checks the length, while Optim checks for nothing.
An improved version, which works with the future NLSolvers and with NLopt:
function autodiff(fn)
    function f(x)
        return fn(x)
    end
    function f(∇f, x)
        if !(∇f == nothing) && (length(∇f) != 0)
            ForwardDiff.gradient!(∇f, fn, x)
        end
        fx = f(x)
        return ∇f == nothing ? fx : (fx, ∇f)
    end
    function f(∇²f, ∇f, x)
        if !(∇²f == nothing)
            ForwardDiff.hessian!(∇²f, fn, x)   # write into the Hessian buffer
        end
        if ∇²f == nothing
            return f(∇f, x)
        else
            fx, ∇f = f(∇f, x)
            return fx, ∇f, ∇²f
        end
    end
    return f
end
Just as an update here, I found the above to not be working (perhaps it did when posted). Here is what I found to work:
function autodiff(fn)
    # adapted from https://github.com/JuliaOpt/NLopt.jl/issues/128
    function f(x)
        return fn(x)
    end
    function f(x, ∇f)
        if !(∇f == nothing) && (length(∇f) != 0)
            ForwardDiff.gradient!(∇f, fn, x)
        end
        fx = fn(x)
        return fx
    end
    return f
end
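A quick way to check that wrapper against a known minimum (the wrapper is repeated so the snippet runs standalone; the Rosenbrock test problem is my own choice):

```julia
using NLopt, ForwardDiff

# Wrapper from above, repeated so this snippet is standalone.
function autodiff(fn)
    f(x) = fn(x)
    function f(x, ∇f)
        if !(∇f == nothing) && (length(∇f) != 0)
            ForwardDiff.gradient!(∇f, fn, x)
        end
        return fn(x)
    end
    return f
end

rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2  # minimum at [1, 1]
opt = Opt(:LD_LBFGS, 2)
xtol_rel!(opt, 1e-10)
min_objective!(opt, autodiff(rosenbrock))
minf, minx, ret = optimize(opt, [0.0, 0.0])
```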
I use this in practice:
import AbstractDifferentiation as AD
function gradient_on_demand(ad_backend, f)
    function (x, ∇fx)
        if isempty(∇fx)
            f(x)
        else
            v, (g,) = AD.value_and_gradient(ad_backend, f, x)
            copy!(∇fx, g)
            v
        end
    end
end
Example:
using NLopt, ForwardDiff
ab = AD.ForwardDiffBackend()
opt = Opt(:LD_SLSQP, 3)
equality_constraint!(opt, gradient_on_demand(ab, x -> sum(x) - 3), 1e-4)
opt.min_objective = gradient_on_demand(ab, x -> sum(abs2, x))
(minf, minx, ret) = optimize(opt, [-1.0, 1.5, 9.9])
Would be happy to make a PR; AbstractDifferentiation is not a heavy dependency (per se), and Enzyme support is WIP.