
A JuMP tutorial

Open • kellertuer opened this pull request 7 months ago • 41 comments

closes #313

📋

  • [x] understand how to use @objective
  • [x] write a bit about optimise! and summary.

kellertuer avatar Apr 18 '25 07:04 kellertuer

Codecov Report

Patch coverage is 72.94118%, with 23 lines in your changes missing coverage. Please review. Project coverage is 99.56%. Comparing base (79f374b) to head (db6341e). Report is 2 commits behind head on master.

Files with missing lines        Patch %   Lines
ext/ManoptJuMPExt.jl             81.57%   14 Missing
ext/ManoptJuMPManifoldsExt.jl     0.00%    9 Missing
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #454      +/-   ##
==========================================
- Coverage   99.80%   99.56%   -0.25%     
==========================================
  Files          85       86       +1     
  Lines        9359     9389      +30     
==========================================
+ Hits         9341     9348       +7     
- Misses         18       41      +23     

View full report in Codecov by Sentry.

codecov[bot] avatar Apr 18 '25 07:04 codecov[bot]

@benoit – I started on the tutorial again, beginning with the variant I thought was easier: providing the Manopt objective directly.

First I ran into a problem that is maybe just a missing print/show method: if I have assigned the model (the first REPL snippet) and then just try to look at the model, I get

julia> @objective(model, Min, ManifoldGradientObjective(f, grad_f))
ManifoldGradientObjective{AllocatingEvaluation}

julia> model
A JuMP Model
├ solver: Manopt with GradientDescentState
├ objective_sense: MIN_SENSE
Error showing value of type Model:

SYSTEM (REPL): showing an error caused an error
ERROR: MethodError: no method matching jump_function_type(::Model, ::Type{ManoptJuMPExt.RiemannianFunction{ManifoldGradientObjective{…}}})
The function `jump_function_type` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  jump_function_type(::GenericModel{T}, ::Type{MathOptInterface.VectorOfVariables}) where T
   @ JuMP ~/.julia/packages/JuMP/RGIK3/src/variables.jl:753
  jump_function_type(::GenericModel{T}, ::Type{MathOptInterface.VariableIndex}) where T
   @ JuMP ~/.julia/packages/JuMP/RGIK3/src/variables.jl:769
  jump_function_type(::GenericModel{T}, ::Type{MathOptInterface.VectorNonlinearFunction}) where T
   @ JuMP ~/.julia/packages/JuMP/RGIK3/src/nlp_expr.jl:1142
  ...

Stacktrace:
  [1] print_response(errio::IO, response::Any, backend::Union{…}, show_value::Bool, have_color::Bool, specialdisplay::Union{…})
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:446
  [2] (::REPL.var"#70#71"{LineEditREPL, Pair{Any, Bool}, Bool, Bool})(io::Any)
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:405
  [3] with_repl_linfo(f::Any, repl::LineEditREPL)
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:678
  [4] print_response(repl::AbstractREPL, response::Any, show_value::Bool, have_color::Bool)
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:403
  [5] (::REPL.var"#do_respond#100"{…})(s::REPL.LineEdit.MIState, buf::Any, ok::Bool)
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:1035
  [6] (::VSCodeServer.var"#103#106"{REPL.var"#do_respond#100"{…}})(mi::REPL.LineEdit.MIState, buf::IOBuffer, ok::Bool)
    @ VSCodeServer ~/.vscode/extensions/julialang.language-julia-1.127.2/scripts/packages/VSCodeServer/src/repl.jl:122
  [7] #invokelatest#2
    @ ./essentials.jl:1055 [inlined]
  [8] invokelatest
    @ ./essentials.jl:1052 [inlined]
  [9] run_interface(terminal::REPL.Terminals.TextTerminal, m::REPL.LineEdit.ModalInterface, s::REPL.LineEdit.MIState)
    @ REPL.LineEdit ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/LineEdit.jl:2755
 [10] run_frontend(repl::LineEditREPL, backend::REPL.REPLBackendRef)
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:1506
 [11] (::REPL.var"#79#85"{LineEditREPL, REPL.REPLBackendRef})()
    @ REPL ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/REPL/src/REPL.jl:497
Some type information was truncated. Use `show(err)` to see complete types.

That is maybe in the “nice to have” area. But the whole example fails when I try to run it. The code I ended up with (as an MWE) is:

using JuMP, Manopt, Manifolds, Random
using ManifoldDiff: grad_distance, prox_distance
Random.seed!(42);

n = 100
σ = π / 8
M = Sphere(2)
p = 1 / sqrt(2) * [1.0, 0.0, 1.0]
data = [exp(M, p,  σ * rand(M; vector_at=p)) for i in 1:n];

model = Model(Manopt.JuMP_Optimizer)
@variable(model, p[i=1:3] in M, start=p[i])
f(M,p) = sum(1 / (2 * n) * norm.(Ref(p) .- data) .^ 2)
grad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)));
@objective(model, Min, ManifoldGradientObjective(f, grad_f))

set_start_value.(p, data[1])
optimize!(model)

Since the model does not print its state, I am not yet sure this works correctly, but (if it does) this might be the first of the two examples I want to do.
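One way to check, with plain JuMP and LinearAlgebra calls that are not specific to this extension, would be to query the model after optimize!:

using LinearAlgebra
solution_summary(model)  # termination status, raw status, and objective value
q = value.(p)            # the point the solver returned
norm(q)                  # should be approximately 1 if q is on the sphere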

kellertuer avatar Apr 25 '25 20:04 kellertuer

To finish the day, at least, I got a first version running, where a first draft of using a full objective seems to work:

https://manoptjl.org/previews/PR454/tutorials/UsingJuMP/

kellertuer avatar Apr 25 '25 21:04 kellertuer

What is the expression of the embedding objective?

blegat avatar Apr 26 '25 06:04 blegat

The f from the example above (or the one with distances for the first working example, which would be more precise I think).

kellertuer avatar Apr 26 '25 06:04 kellertuer

Then don't use a user-defined function; let JuMP's automatic differentiation do the work:

@objective(model, Min, sum(sum((p - d).^2) for d in data) / (2 * n))

blegat avatar Apr 26 '25 07:04 blegat

Hm, that is again all a bit tricky. Why do I have to use yours and not

f2(p) = sum(1 / (2 * n) * norm.(Ref(p) .- data) .^ 2)
@objective(model2, Min, f2(p))

? That one errors with “not supported”.

edit: And your cost does “not work”

julia> solution_summary(model2)
solution_summary(; result = 1, verbose = false)
├ solver_name          : Manopt with GradientDescentState
├ Termination
│ ├ termination_status : LOCALLY_SOLVED
│ ├ result_count       : 1
│ └ raw_status         : At iteration 200 the algorithm reached its maximal number of iterations (200).
└ Solution (result = 1)
  ├ primal_status        : FEASIBLE_POINT
  ├ dual_status          : NO_SOLUTION
  ├ objective_value      : 2.67372e+01
  └ dual_objective_value : 1.00000e+02

which is a cost value much, much larger than the first one. But all of this is still magic to me – for example, my cost, which gets rejected, reads the same as yours, which is accepted, so I do not know what it magically does inside – and the raw_status indicates that the algorithm did not converge.

Maybe we should leave that out then? I do not see how I can understand this, since to me JuMP stays total magic that sometimes magically works but for most of my intuitive approaches just throws errors at me :/ (and yes, sure, the most probable cause is me not being clever enough).

Even worse – the result is no longer on the sphere, so somewhere in between either the gradient is not the Riemannian gradient or Manopt is not used.

julia> value.(p)
3-element Vector{Float64}:
 0.5883892547439781
 0.004486488706579396
 0.62161572207423

julia> norm(ans) # On the sphere this would have to be one.
0.855937000957988

I'll submit this for now, think about it tomorrow, and probably delete it. I have now spent two full days getting something to work and have to honestly confess that I am not able to understand how JuMP is supposed to work.

edit2: What looks really strange: when I define the cost as you say, I get

julia> @objective(model2, Min, sum(sum((p - d).^2) for d in data)) / (2 * n)
0.5 p[1]² + 0.5 p[2]² + 0.5 p[3]² - 0.5883892542830678 p[1] - 0.004486488827604947 p[2] - 0.6216157225096294 p[3] + 0.5

so if that is the cost being used, it is completely wrong, since data consists of about 100 points on the sphere.

kellertuer avatar Apr 26 '25 10:04 kellertuer

JuMP's AD currently does not support vectors; I'm working on it, and it is the first item on our roadmap: https://jump.dev/JuMP.jl/stable/developers/roadmap/#Development-roadmap. That is why norm is not supported.

so when that is the cost used, it is fully wrong since data is about 100 points on the sphere.

The objective is a sum of 100 different quadratic expressions. Because they are quadratic, JuMP simplifies the sum by actually carrying it out. If you do not want JuMP to simplify affine and quadratic expressions and prefer to keep the expression tree before any simplification, there is https://jump.dev/JuMP.jl/stable/api/JuMP/#@force_nonlinear. But I don't think that is causing any issues here.
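For illustration, a minimal scalar example of the difference (the model and variable here are unrelated to the tutorial):

using JuMP
illu = Model()
@variable(illu, x)
# parsed as a quadratic expression and expanded/simplified right away:
@expression(illu, (x - 0.1)^2)
# keeps the expression tree instead of expanding it:
@expression(illu, @force_nonlinear((x - 0.1)^2))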

Here is a reduction of the issue that is completely JuMP-independent.

using Manopt, Manifolds, LinearAlgebra
# n, data, and p0 are reused from the example further above
manifold = Sphere(2)
function eval_f_cb(M, p)
    return sum(sum((p - d).^2) for d in data) / (2 * n)
end
function eval_grad_f_cb(M, p)
    grad_f = 2sum((p - d) for d in data) / (2 * n)
    return Manopt.riemannian_gradient(manifold, p, grad_f)
end
objective = Manopt.ManifoldGradientObjective(eval_f_cb, eval_grad_f_cb)
dmgo = decorate_objective!(manifold, objective)
problem = DefaultManoptProblem(manifold, dmgo)
s = Manopt.GradientDescentState(manifold; p=p0)
state = decorate_state!(s)
solve!(problem, state)
solution = Manopt.get_solver_return(get_objective(problem), state)
norm(solution)

Running this, I get as value of solution:

3-element Vector{Float64}:
 0.5883892594536084
 0.00448648746991528
 0.6216157176252621

which has norm

0.855937000957987

which is indeed not on the sphere. However, as this is pure Manopt and no JuMP involved, it is pure magic to me, so the floor is yours ;)

blegat avatar Apr 28 '25 10:04 blegat

Ok, thanks for narrowing it down; since it is now a Manopt problem, I will check what is going wrong there. At first I thought that JuMP maybe somehow did not do the gradient conversion, but I had no clue how to dive into that code.

For the pure Manopt version I know how to get more insight and will try to check what is happening there. At first glance the code looks fine, so maybe the initial data is wrong, or the original cost (the one with distances on the manifold) really has to be used and the norm-based approximation cannot be used here.

kellertuer avatar Apr 28 '25 11:04 kellertuer

One of the top causes of losing normalization when optimizing on a sphere in Manopt is the accumulation of numerical errors. The default exponential map isn't particularly stable, and this is a common result. Projecting the iterate back onto the manifold is usually a good solution, but currently figuring out when it needs to be turned on is left to the user.

@kellertuer maybe it would be a good idea to add a check at the end of optimization whether the result lies on the manifold and, if it doesn't, suggest that the user enable projection?
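Something like this, as a rough sketch of the idea (check_result is just a hypothetical name, M the manifold and q the final iterate):

using ManifoldsBase
# hypothetical post-solve check: warn if the final iterate q left the manifold M
function check_result(M, q)
    if !is_point(M, q)
        @warn "The result is not a point on the manifold; consider a projecting retraction, e.g. retraction_method=ProjectionRetraction()."
    end
    return q
end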

mateuszbaran avatar Apr 28 '25 12:04 mateuszbaran

Hm, but the Riemannian gradient conversion does the projection onto the tangent space (actually only that), so while this is the problem in general, it might not be the problem here? But I will investigate that a bit.

For the warning, I am not yet sure where I should put that in a generic JuMP interface.

kellertuer avatar Apr 28 '25 13:04 kellertuer

For the warning, I am not yet sure where I should put that in a generic JuMP interface.

This should be changed to return MOI.INFEASIBLE_POINT if the point is not on the manifold: https://github.com/JuliaManifolds/Manopt.jl/blob/013feadf95736639cdedb1bff2c4f1723f7dfe64/ext/ManoptJuMPExt.jl#L462-L468

If you want to give a custom message to the user, such as advice to set a projection, you can simply print it with @warn, or you could include it in the RawStatusString: https://github.com/JuliaManifolds/Manopt.jl/blob/013feadf95736639cdedb1bff2c4f1723f7dfe64/ext/ManoptJuMPExt.jl#L484-L488
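As a rough sketch of how both could look (SketchOptimizer and its fields are only stand-ins for what the extension actually stores, not the real type):

using MathOptInterface, ManifoldsBase
const MOI = MathOptInterface

# stand-in for the optimizer type of the extension
mutable struct SketchOptimizer <: MOI.AbstractOptimizer
    manifold::AbstractManifold
    solution::Union{Nothing,Vector{Float64}}
end

function MOI.get(opt::SketchOptimizer, ::MOI.PrimalStatus)
    opt.solution === nothing && return MOI.NO_SOLUTION
    return is_point(opt.manifold, opt.solution) ? MOI.FEASIBLE_POINT : MOI.INFEASIBLE_POINT
end

function MOI.get(opt::SketchOptimizer, ::MOI.RawStatusString)
    if opt.solution !== nothing && !is_point(opt.manifold, opt.solution)
        return "The solution is not on the manifold; consider retraction_method=ProjectionRetraction()."
    end
    return "solved"
end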

blegat avatar Apr 28 '25 13:04 blegat

Hm, I first have to check and find out why we are leaving the manifold; I did not expect that at all.

kellertuer avatar Apr 28 '25 13:04 kellertuer

Hm, yeah, the data is completely random and wrong -.-

The current code just stays at the initial point, but that is already not a point on the manifold.

kellertuer avatar Apr 28 '25 14:04 kellertuer

The initial point is p0 = 1 / sqrt(2) * [1.0, 0.0, 1.0], which is on the manifold.

blegat avatar Apr 28 '25 14:04 blegat

So, here is a full example with data on the manifold. I confused myself with the statement above: one has to copy data[1] so that the data is not modified when it is used for p0. Then this code

using JuMP, Manopt, Manifolds, Random
using ManifoldDiff: grad_distance, prox_distance
Random.seed!(42);

n = 100
σ = π / 8
manifold = Sphere(2)
p = 1 / sqrt(2) * [1.0, 0.0, 1.0]
data = [exp(manifold, p, σ * rand(manifold; vector_at=p)) for i in 1:n];

p0 = copy(manifold, data[1]);

function eval_f_cb(M, p)
    return sum(sum((p - d) .^ 2) for d in data) / (2 * n)
end
function eval_grad_f_cb(M, p)
    grad_f = sum((p - d) for d in data) / n       # Euclidean gradient in the embedding
    X = Manopt.riemannian_gradient(M, p, grad_f)  # conversion to the Riemannian gradient
    X2 = project(M, p, grad_f)                    # projection onto the tangent space (should coincide with X)
    println(
        "EGrad: ", grad_f,
        "\nGrad: ", X,
        "\n point? ", is_point(M, p; error=:info),
        " | vector? ", is_vector(M, p, X, false; error=:warn),
        " vector2? ", is_vector(M, p, X2, false; error=:warn),
    )
    return X2
end
objective = Manopt.ManifoldGradientObjective(eval_f_cb, eval_grad_f_cb)
problem = DefaultManoptProblem(manifold, objective)
s = Manopt.GradientDescentState(manifold; p=p0)
state = decorate_state!(s; debug=[:Iteration, :Cost, :GradientNorm, :Stepsize, "\n"])
solve!(problem, state)
solution = Manopt.get_solver_return(state)
is_point(manifold, solution; error=:info)

This fails even if we use the projection (X2) instead of riemannian_gradient – which should (and does) do the same thing. That means the gradient is, in theory, always a valid Riemannian gradient, since we project it onto the tangent space, so exp should always work fine.

However, at iteration 9 the gradient is 10e-16 off (no clue why, since we project), which accumulates; at iteration 44 it is 10e-10 off, and that is enough for the points to “wander off”.

I have exactly zero clue why this is happening; the code above prints a lot of information about it, but I am getting tired by now after an hour of looking at the results.

Maybe we should abandon the idea of providing f if it fails even for the simplest example I can come up with? I have no clue how to fix this. Providing a Riemannian cost/gradient is not so useful for inexperienced users, but it is maybe the only thing Manopt can currently offer. If that is too complicated for JuMP, we can also slowly abandon the idea of the extension and not provide it in JuMP. Maybe Manopt is – overall, sadly – still not of the quality I had wished it would have reached by now. If we abandon this, I can also close this PR and all others related to JuMP.

Sorry for the inconvenience caused.

kellertuer avatar Apr 28 '25 14:04 kellertuer

Hm, if we cheat and do not take exp but instead use

s = Manopt.GradientDescentState(manifold; p=p0, retraction_method=ProjectionRetraction())

it works more stably. But that would mean a JuMP user has to dive very deep into Manopt details. So my question would be: is it useful to continue this?
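For reference, the same cheat via the high-level interface would be something like this (reusing manifold, eval_f_cb, eval_grad_f_cb, and p0 from the example above):

q = gradient_descent(manifold, eval_f_cb, eval_grad_f_cb, p0;
    retraction_method=ProjectionRetraction(),
)
is_point(manifold, q; error=:info)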

kellertuer avatar Apr 28 '25 14:04 kellertuer

This is not an issue with the JuMP interface, and Manopt works nearly completely fine. You just didn't anticipate the influence of compounding numerical errors. This doesn't expose any fundamental errors. I don't think we should abandon anything. This is even a good pedagogical example of numerical issues, though it might be good to improve Manopt's default robustness.

mateuszbaran avatar Apr 28 '25 15:04 mateuszbaran

Do I understand correctly that the conclusion is that you would recommend retraction_method=ProjectionRetraction() in combination with ManifoldDiff.riemannian_gradient? If that's the case, we can just choose that retraction method by default if the user provides an objective in the embedding, right?

blegat avatar Apr 28 '25 16:04 blegat

No. That retraction only exists on a few manifolds, as does exp (or, in the long form for the keyword, ExponentialRetraction()).

The problem is

In theory the exponential map, which is used by default on the sphere, is the exact operation one should do as a gradient step (“the thing that replaces + on manifolds”). In practice, exp on the sphere is a bit numerically unstable, but this is very specific to the sphere. The ProjectionRetraction(), in long form “do + and project back onto the manifold”, approximates the exponential map (up to first order). In theory it is a bit inexact in every step; in practice it is numerically more stable.

What one should do on any other manifold is not so easy to say because, similar to here, it also heavily depends on which cost and gradient you take.
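To make the difference concrete, a tiny standalone comparison on the sphere (purely for illustration):

using Manifolds, LinearAlgebra
M = Sphere(2)
p = 1 / sqrt(2) * [1.0, 0.0, 1.0]
X = [0.0, 0.2, 0.0]                           # a tangent vector at p
q1 = exp(M, p, X)                             # exact geodesic step
q2 = retract(M, p, X, ProjectionRetraction()) # “do + and project”
norm(q1 - q2)                        # small, the two agree up to higher-order terms in X
abs(norm(q1) - 1), abs(norm(q2) - 1) # both stay on the sphere up to rounding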

kellertuer avatar Apr 28 '25 16:04 kellertuer

We could change the default retraction on the sphere so that it does exp and then projects the point for numerical stability. It would still properly reflect the geometry of the sphere while being more numerically stable.

mateuszbaran avatar Apr 28 '25 16:04 mateuszbaran

We would probably need a new retraction type for that.
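The step itself is just “exp, then project the point back”, so as a plain helper function (only a sketch, not a proper retraction type) it would read:

using Manifolds
stabilized_step(M, p, X) = project(M, exp(M, p, X))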

kellertuer avatar Apr 28 '25 16:04 kellertuer

Hm, now that we in principle have the stabilised exponential map after Mateusz' update (thanks!), I again got lost in the JuMP docs over one simple task.

Instead of s = Manopt.GradientDescentState(manifold; p=p0) I want s = Manopt.GradientDescentState(manifold; p=p0, retraction_method=ManifoldsBase.StabilizedRetraction(ExponentialRetraction()))

to use within JuMP. I could not find anything on how to change things like that? It seems only the plain GradientDescentState with its defaults is possible? The only thing I could find is how to maybe exchange the StateType...

kellertuer avatar Apr 30 '25 13:04 kellertuer

Given https://github.com/JuliaManifolds/Manopt.jl/blob/8e754a367bdd1dc6bedbd4a91588e4e7138d953a/ext/ManoptJuMPExt.jl#L374-L377 Maybe try

set_attribute(model, "retraction_method", ManifoldsBase.StabilizedRetraction(ExponentialRetraction()))

Let me know if that works

blegat avatar Apr 30 '25 14:04 blegat

Will try. If that works, one should maybe document it somewhere, because not everyone is a code archaeologist.

kellertuer avatar Apr 30 '25 14:04 kellertuer

I agree, we could also add a test

blegat avatar Apr 30 '25 14:04 blegat

Hm, I added that line and even committed it. The summary is that the line (yes, adapted to model2, since it is the second example) does exactly nothing; the second model does not work.

I tried to find where the state is hidden now, to check whether the retraction is set, but I have no clue where that is; it seems to be hidden somewhere very secretly in model2. I think the set_attribute does not work.

edit: Or to be more precise

  • I do not see what the set actually sets
  • I do not see whether it sets anything
  • I do not see where and when the GradientDescentState is initialised
  • I do not see which retraction it is initialised with

I tried my best, but after a whole afternoon of fighting one line of code I give up (again). At this speed, the tutorial will take forever. :/ Sorry for that.

kellertuer avatar Apr 30 '25 16:04 kellertuer

I tried it locally and it works. Could you fix it? What does @show kws give if you add it after this line: https://github.com/JuliaManifolds/Manopt.jl/blob/1f684e8b61ac2217a93f1bb357679a2fb2acdbfc/ext/ManoptJuMPExt.jl#L376

blegat avatar Apr 30 '25 21:04 blegat

I am not sure it works; I tried a few things I do not understand and then decided to continue only on Friday and left to sit a bit at the fjord.

The current version https://manoptjl.org/previews/PR454/tutorials/UsingJuMP/ does something different in the embedding, or converges differently, but the cost is also a bit different between Variant 1 (geodesic distance) and Variant 2 (norm), so I am not sure. One thing I did change along the way was that the /(2n) division was not inside the objective but outside (so it only divided the returned value by 2n, not the objective itself). That might have been a larger problem, since it would lead to super large gradients that wrap around the sphere several times and hence merely resemble a random walk.
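For reference, the placement that makes the division part of the objective (reusing model2, p, data, and n from above) would be:

# division inside the macro call, so it scales the objective itself
@objective(model2, Min, sum(sum((p - d) .^ 2) for d in data) / (2 * n))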

kellertuer avatar Apr 30 '25 22:04 kellertuer

While I am not sure why it did not work before, I am also not sure why it works now ;)

I will still take some time over the next days to check what could best be documented, so that the experience is a bit easier for other users and they do not have to dive into the code to see how to set solver options – or the solvers themselves.

kellertuer avatar May 01 '25 12:05 kellertuer