
Should we simplify NonLinearExpr similar to AffExpr or QuadExpr?

Open mitchphillipson opened this issue 1 year ago • 9 comments

Explanation of Issue

I'll start with an example of the issue:

julia> using JuMP;
julia> A = [:a,:b];
julia> B = [:c,:d,:e];
julia> p = JuMP.Containers.DenseAxisArray([1 0 3; 0 5 0], A,B);
julia> σ = JuMP.Containers.DenseAxisArray([.5, .75, .25], B);
julia> M = Model();
julia> @variable(M, x[A,B]);
julia> @expression(M, test[a=A], sum(x[a,b]^2*p[a,b] for b∈B));
julia> @expression(M, test_nl[a=A], sum(x[a,b]^(σ[b])*p[a,b] for b∈B));

The issue is the difference between

julia> test
1-dimensional DenseAxisArray{QuadExpr,1,...} with index sets:
    Dimension 1, [:a, :b]
And data, a 2-element Vector{QuadExpr}:
 x[a,c]² + 3 x[a,e]²
 5 x[b,d]²

and

julia> test_nl
1-dimensional DenseAxisArray{NonlinearExpr,1,...} with index sets:
    Dimension 1, [:a, :b]
And data, a 2-element Vector{NonlinearExpr}:
 ((x[a,c] ^ 0.5) * 1.0) + ((x[a,d] ^ 0.75) * 0.0) + ((x[a,e] ^ 0.25) * 3.0)
 ((x[b,c] ^ 0.5) * 0.0) + ((x[b,d] ^ 0.75) * 5.0) + ((x[b,e] ^ 0.25) * 0.0)

Notice that test and test_nl differ only in their exponents, but test drops any term multiplied by 0 whereas test_nl does not. This goes deeper than just the @show method, as can be seen here:

julia> test[:b].terms
OrderedCollections.OrderedDict{UnorderedPair{VariableRef}, Float64} with 1 entry:
  UnorderedPair{VariableRef}(x[b,d], x[b,d]) => 5.0
julia> test_nl[:b].args
3-element Vector{Any}:
 (x[b,c] ^ 0.5) * 0.0
 (x[b,d] ^ 0.75) * 5.0
 (x[b,e] ^ 0.25) * 0.0

The zero terms no longer exist in test.

I typically work with large, sparse matrices, meaning there are many zero terms. Dropping these zero terms reduces the model complexity and matches what one would expect.

This may also speed up the evaluator. As an extreme example, suppose you have an $n\times n$ diagonal matrix and a constraint of the form $Ax^m = b$. If $m \in \{1, 2\}$ there are $n$ term evaluations; otherwise there are currently $n^2$.

Solution Suggestion

My suggestion to fix this is to implement drop_zeros! on NonlinearExpr; I'm guessing its absence is the culprit. This is more complicated for a NonlinearExpr, as it requires recursing through the leaves of the expression tree.
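To make the suggestion concrete, here is a rough sketch of what such a recursive pass could look like. This is not JuMP code: the field names head and args match the structure shown in test_nl[:b].args above, but the NonlinearExpr constructor call and dispatch details are assumptions.

```julia
# Sketch only: recursively drop `* 0.0` products and `+ 0.0` summands
# from a NonlinearExpr tree. Assumes NonlinearExpr(head, args) is a
# valid constructor; not verified against JuMP internals.
function drop_zeros_nl(ex)
    ex isa NonlinearExpr || return ex           # leaves: variables, numbers, AffExpr, ...
    args = Any[drop_zeros_nl(a) for a in ex.args]
    is_zero(a) = a isa Real && iszero(a)
    if ex.head == :* && any(is_zero, args)
        return 0.0                              # x * 0.0  =>  0.0
    elseif ex.head == :+
        filter!(!is_zero, args)                 # drop `+ 0.0` terms
        isempty(args) && return 0.0
        length(args) == 1 && return only(args)
    end
    return NonlinearExpr(ex.head, args)
end
```

Note one caveat: rewriting x * 0.0 to 0.0 assumes the subexpression never evaluates to Inf or NaN (since 0.0 * Inf == NaN), which may be one reason to be cautious about doing this by default.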

One compounding issue

In general, NonlinearExpr expressions don't simplify:

julia> M = Model();

julia> @variable(M,x);

julia> sum(x+i for i∈1:4)
4 x + 10

julia> sum(x^5+i for i∈1:4)
((((x ^ 5) + 1.0) + ((x ^ 5) + 2.0)) + ((x ^ 5) + 3.0)) + ((x ^ 5) + 4.0)

This is related to the above, but more general. It is something to keep in mind when working on the above, though implementing it may be non-trivial.

Edit: flatten! gets close to fixing the above:

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0

but not quite perfect.

mitchphillipson avatar Sep 11 '23 14:09 mitchphillipson

I was hesitant to make too many changes to the expression graph. For example, which of the following is desired?

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0

julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 10.0

julia> flatten!(sum(x^5+i for i∈1:4))
4 * (x ^ 5) + 10.0

But I think it is reasonable to drop any + 0.0 or * 0.0 terms. Other changes...I'm not so sure.

odow avatar Sep 11 '23 21:09 odow

flatten! does a decent job of simplifying as is. Obviously, I'd prefer the reverse of the order in which you gave the examples, but I can see how issues would arise.

Dropping zeros would be sufficient, at least for the moment, and if I need more I'll write and propose a simplify function or something similar.

mitchphillipson avatar Sep 11 '23 22:09 mitchphillipson

We had a discussion here: https://github.com/jump-dev/JuMP.jl/pull/3502

One issue that came up is users might expect us to simplify more than just dropping 0, but that's a much larger design challenge.

If you are reading this issue and have an example in which simplifying would make a difference, please post it. If we get enough requests, we may revisit this problem.

odow avatar Sep 12 '23 21:09 odow

Discussed with @ccoffrin and @robbybp. Conclusion is that we shouldn't do this by default. But we should consider it in the context of larger work, such as a nonlinear presolve algorithm.

odow avatar Sep 14 '23 22:09 odow

@mcosovic suggested the drop_zeros! thing as well in #3510, so I should take another look at this.

odow avatar Sep 15 '23 22:09 odow

Just to weigh in on this:

Currently, I have the problem that (x-0.1)^2 + sin(x) seems to be rewritten as x*x-0.2*x+0.01+sin(x), most likely because a quadratic expression is built first. In the context of global optimization, the way an expression is written can matter greatly because of interval arithmetic. I think it is quite easy to ruin the intended formulation with obvious reformulations, and I would recommend not doing them automatically unless tools to configure or disable this rewriting are also in place.
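A concrete illustration of the interval-arithmetic point (the numbers here are mine, chosen for illustration): take $x \in [0, 0.2]$. In the factored form, $x - 0.1 \in [-0.1, 0.1]$, so $(x - 0.1)^2 \in [0, 0.01]$, which is the exact range. In the expanded form, interval arithmetic evaluates $x^2 \in [0, 0.04]$ and $-0.2x \in [-0.04, 0]$ independently, giving $x^2 - 0.2x + 0.01 \in [-0.03, 0.05]$, an enclosure eight times wider that even includes negative values, because the dependency between the two occurrences of $x$ is lost.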

Downsite avatar Mar 25 '24 16:03 Downsite

I think it is quite easy to ruin the intended formulation with obvious reformulations and I would recommend not doing them automatically unless tools to configure or disable this reformulation are also in place.

Yes, I think this is our conclusion too.

most likely because a quadratic expression is built first

This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr:

julia> model = Model();

julia> @variable(model, x)
x

julia> @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + (x² - 0.2 x + 0.010000000000000002)

Because of the way that JuMP parses the macros, there is no easy way to override this behavior.

odow avatar Mar 25 '24 19:03 odow

This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr:

julia> model = Model();

julia> @variable(model, x)
x

julia> @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + (x² - 0.2 x + 0.010000000000000002)

Because of the way that JuMP parses the macros, there is no easy way to override this behavior.

My current workaround is to add a dummy variable fixed to zero and formulate

 @expression(model, sin(x) + (dummy^3 + x - 0.1)^2)

Downsite avatar Mar 26 '24 09:03 Downsite

@Downsite we added @force_nonlinear in #3732, so you can write:

julia> model = Model(); @variable(model, x);

julia> @expression(model, @force_nonlinear sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)

julia> @expression(model, sin(x) + @force_nonlinear((x - 0.1)^2))
sin(x) + ((x - 0.1) ^ 2)

julia> @force_nonlinear @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)

For the main issue, I think our answer to this is "no". Adding simplification rules by default in JuMP just introduces the risk of problems. We want the nonlinear expressions to be passed to the solver pretty much as-is.

In the future, we could consider adding a NonlinearPresolve.Optimizer layer, that users could opt-in to.

odow avatar May 02 '24 22:05 odow

I'm tempted to close this. I don't know if there's anything actionable left to do here.

odow avatar May 29 '24 03:05 odow

Closing as won't-fix for now. We can always revisit this in the future, but I think the first step is to develop some sort of MOI nonlinear presolve layer to test ideas. This doesn't need to happen in JuMP.jl, and it doesn't need to be one of the core contributors, so if you are reading this comment and are interested, feel free to explore with some ideas in a new repository :smile:

odow avatar Jun 20 '24 22:06 odow