Accessing objective's gradient for MOI (and others) in the solution type
While thinking about #8 and #148 in order to finalize the output structure: it is often very useful to know the gradient of the objective (and the constraint Jacobian) at the solution. I think this means you would want to have the gradients in the solution type (which I believe is in https://github.com/SciML/SciMLBase.jl/blob/master/src/solutions/optimization_solutions.jl ?)
In MOI this is done with the `MOI.eval_objective_gradient` call, I believe, which you could use to fill it in. You can see https://github.com/jump-dev/Ipopt.jl/blob/master/src/MOI_wrapper.jl for an example implementation that shows how it is called. Things like JuMP then call these functions as required to fill in their own problem structures.
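To make the idea concrete, here is a minimal sketch of how a wrapper could use the NLP-evaluator interface to compute the objective gradient at the solution before storing it. The helper name `objective_gradient_at` is hypothetical; `d` and `x` are assumed to come from the solver wrapper:

```julia
using MathOptInterface
const MOI = MathOptInterface

# Hypothetical helper: compute ∇f(x) from an MOI NLP evaluator so that it
# can be placed in the solution type afterwards.
function objective_gradient_at(d::MOI.AbstractNLPEvaluator, x::Vector{Float64})
    MOI.initialize(d, [:Grad])               # request gradient support
    grad = similar(x)
    MOI.eval_objective_gradient(d, grad, x)  # fills `grad` in place with ∇f(x)
    return grad
end
```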
An aside: for the AD rules here, all of those gradients and Lagrange multipliers are essential to keep around in one form or another - so having them in the structure might make the ChainRules definitions a little easier to implement. An example is implementing the envelope condition to provide the AD rule for constrained optimization.
I think this request is part of a more general issue that there may be all sorts of things in the return type from optimizers which you want to standardize access to over time.
One approach is to have lots of fields which may be `nothing`, and potentially start subtyping as required (e.g. OptimizationSolution, ConstrainedOptimizationSolution, ComplementarityConstrainedOptimizationSolution, etc.). For what it is worth, the MOI crew decided not to have a specific structure for this and instead to implement functions and use dispatch. But the goals of MOI are very different (i.e., its main goal is providing the modeling-language features to support things like JuMP). I am not sure how well that approach would work for AD rules, so I am not suggesting it here.
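The fields-which-may-be-`nothing` approach could look something like this (a sketch only; the field names and type parameters are illustrative, not a proposal for the actual SciMLBase type):

```julia
# Illustrative sketch: a solution type with optional fields, so each solver
# populates only what it can actually provide.
struct ConstrainedOptimizationSolutionSketch{uType,T}
    u::uType                                        # minimizer
    objective::T                                    # f(u)
    gradient::Union{Nothing,Vector{T}}              # ∇f(u), if available
    constraint_jacobian::Union{Nothing,Matrix{T}}   # J(u), if available
    duals::Union{Nothing,Vector{T}}                 # Lagrange multipliers, if available
end
```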
Anyways, I am not suggesting any particular design, just that these requests are all going to come - especially as you add more on constrained optimization. All of the numbers start mattering, and if the values are lost then people might have to go back to using the raw optimizers without GalacticOptim.
The duals become important as well, since the values of the Lagrange multipliers have important interpretations in many cases.
It wouldn't affect that. It would be in the closure.
Lots of fields which may be `nothing` is the better way here.
Perfect. Well, to start things off, let's start getting the gradients of the objective in there where possible! MOI is a good one to start with.
The main difference is that we don't have a solution type. Solutions are stored inside the model object, which is owned by each package/solution algorithm (e.g., Cbc.Optimizer, Optim.Optimizer, ...). That type then overloads the appropriate getters for different solution attributes. It also allows optimizers to overload solver-specific solution attributes.
JuMP has a SolutionSummary object for simple problems, which is mainly intended as a print-debugging tool:
https://github.com/jump-dev/JuMP.jl/blob/6901e8f98bf24242a141e003e42530fa90f33c3c/src/print.jl#L414-L436
but it's a pain to build because you have to try-catch for "do you support this attribute?":
https://github.com/jump-dev/JuMP.jl/blob/6901e8f98bf24242a141e003e42530fa90f33c3c/src/print.jl#L474-L477
Having solution methods subtype some AbstractSolutionAlgorithm type, and then overloading `get` for different attributes, is really the way to go. It's a pain that MOI is big and complicated and doesn't support first-class nonlinear, but it really is the way to abstract over a wide range of solvers and problem types.
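For illustration, the getter-dispatch design could be sketched like this (all names here are hypothetical; the attribute marker types are analogous to how `MOI.get` takes attribute instances):

```julia
# Hedged sketch of the dispatch-on-getters approach.
abstract type AbstractSolutionAlgorithm end

# Attribute marker types, in the spirit of MOI attributes.
struct ObjectiveGradient end
struct DualSolution end

struct MySolverSolution <: AbstractSolutionAlgorithm
    grad::Vector{Float64}
    duals::Vector{Float64}
end

# Each solver overloads `get` only for the attributes it supports; querying
# an unsupported attribute throws a MethodError instead of requiring
# try-catch around fields that may or may not be populated.
Base.get(s::MySolverSolution, ::ObjectiveGradient) = s.grad
Base.get(s::MySolverSolution, ::DualSolution) = s.duals
```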