JuMP.jl
Should we simplify NonlinearExpr similarly to AffExpr or QuadExpr?
Explanation of Issue
I'll start with an example of the issue:
julia> using JuMP;
julia> A = [:a,:b];
julia> B = [:c,:d,:e];
julia> p = JuMP.Containers.DenseAxisArray([1 0 3; 0 5 0], A,B);
julia> σ = JuMP.Containers.DenseAxisArray([.5, .75, .25], B);
julia> M = Model();
julia> @variable(M, x[A,B]);
julia> @expression(M, test[a=A], sum(x[a,b]^2*p[a,b] for b∈B));
julia> @expression(M, test_nl[a=A], sum(x[a,b]^(σ[b])*p[a,b] for b∈B));
The issue is the difference between
julia> test
1-dimensional DenseAxisArray{QuadExpr,1,...} with index sets:
Dimension 1, [:a, :b]
And data, a 2-element Vector{QuadExpr}:
x[a,c]² + 3 x[a,e]²
5 x[b,d]²
and
julia> test_nl
1-dimensional DenseAxisArray{NonlinearExpr,1,...} with index sets:
Dimension 1, [:a, :b]
And data, a 2-element Vector{NonlinearExpr}:
((x[a,c] ^ 0.5) * 1.0) + ((x[a,d] ^ 0.75) * 0.0) + ((x[a,e] ^ 0.25) * 3.0)
((x[b,c] ^ 0.5) * 0.0) + ((x[b,d] ^ 0.75) * 5.0) + ((x[b,e] ^ 0.25) * 0.0)
Notice that test and test_nl differ only in the powers, but test drops any term multiplied by 0 whereas test_nl does not. This is deeper than just the printing, as can be seen here:
julia> test[:b].terms
OrderedCollections.OrderedDict{UnorderedPair{VariableRef}, Float64} with 1 entry:
UnorderedPair{VariableRef}(x[b,d], x[b,d]) => 5.0
julia> test_nl[:b].args
3-element Vector{Any}:
(x[b,c] ^ 0.5) * 0.0
(x[b,d] ^ 0.75) * 5.0
(x[b,e] ^ 0.25) * 0.0
The zero terms no longer exist in test.
I typically work with large, sparse matrices, meaning there are many zero terms. Dropping these zero terms reduces the model complexity and agrees with what one would expect.
This may also speed up the evaluator. As an extreme example, suppose $A$ is an $n\times n$ diagonal matrix and you have a constraint of the form $Ax^m = b$: if $m=1,2$ there are $n$ term evaluations, otherwise there are currently $n^2$.
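To make the $n$ versus $n^2$ point concrete, here is a small variation of the example above using a diagonal matrix (the names quad and nl are my own; this is only a sketch of the term counts):
julia> n = 3; model = Model(); @variable(model, x[1:n]);
julia> A = [i == j ? 1.0 : 0.0 for i in 1:n, j in 1:n];  # diagonal matrix
julia> @expression(model, quad[i = 1:n], sum(x[j]^2 * A[i, j] for j in 1:n));
julia> @expression(model, nl[i = 1:n], sum(x[j]^3 * A[i, j] for j in 1:n));
julia> length(quad[1].terms)   # zero terms dropped: one term per row, n in total
1
julia> length(nl[1].args)      # zero terms kept: n terms per row, n^2 in total
3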
Solution Suggestion
My suggestion to fix this is to implement drop_zeros! for NonlinearExpr; I'm guessing the lack of that method is the culprit. This is more complicated for a NonlinearExpr because it requires recursing through the leaves of the expression tree.
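Just to sketch the kind of recursion I have in mind (a rough illustration only, not a proposal for JuMP's API; it reaches into the head/args internals of NonlinearExpr, and _drop_zeros is a hypothetical helper name):
_is_zero(t) = t isa Number && iszero(t)

_drop_zeros(t) = t   # variables, numbers, AffExpr, QuadExpr: returned unchanged

function _drop_zeros(ex::NonlinearExpr)
    # simplify the children first, then look at this node
    args = Any[_drop_zeros(a) for a in ex.args]
    if ex.head == :* && any(_is_zero, args)
        return 0.0                        # a zero factor wipes out the whole product
    elseif ex.head == :+
        filter!(!_is_zero, args)          # drop `+ 0` terms
        isempty(args) && return 0.0
        length(args) == 1 && return only(args)
    end
    return NonlinearExpr(ex.head, args)
end
Applied to test_nl[:b] above, this would leave only the (x[b,d] ^ 0.75) * 5.0 term.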
One compounding issue
In general, NonlinearExpr objects don't simplify:
julia> M = Model();
julia> @variable(M,x);
julia> sum(x+i for i∈1:4)
4 x + 10
julia> sum(x^5+i for i∈1:4)
((((x ^ 5) + 1.0) + ((x ^ 5) + 2.0)) + ((x ^ 5) + 3.0)) + ((x ^ 5) + 4.0)
This is related to the issue above, but more general. It is worth keeping in mind when working on the above, though implementing it may be non-trivial.
*edit: flatten! gets close to fixing the above:
julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0
but it's not quite perfect.
I was hesitant to make too many changes to the expression graph. For example, which of the following is desired?
julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 1.0 + 2.0 + 3.0 + 4.0
julia> flatten!(sum(x^5+i for i∈1:4))
(x ^ 5) + (x ^ 5) + (x ^ 5) + (x ^ 5) + 10.0
julia> flatten!(sum(x^5+i for i∈1:4))
4 * (x ^ 5) + 10.0
But I think it is reasonable to drop any + 0.0 or * 0.0 terms. Other changes... I'm not so sure.
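For reference, drop_zeros! already exists for affine and quadratic expressions, so dropping zeros would mirror existing behavior (a small illustrative demo with my own variable names):
julia> model = Model(); @variable(model, y[1:2]);
julia> aff = 1.0 * y[1] + 0.0 * y[2];   # may carry a 0 y[2] term
julia> drop_zeros!(aff);                # removes any terms with a zero coefficient
julia> aff
y[1]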
flatten! does a decent job of simplifying as is. Obviously, I'd prefer the reverse order of the examples you gave, but I can see how issues would arise.
Dropping zeros would be sufficient, at least for the moment, and if I need more I'll write and propose a simplify function or something similar.
We had a discussion here: https://github.com/jump-dev/JuMP.jl/pull/3502
One issue that came up is that users might expect us to simplify more than just dropping 0, but that's a much larger design challenge.
If you are reading this issue and have an example in which simplifying would make a difference, please post it. If we get enough requests, we may revisit this problem.
Discussed with @ccoffrin and @robbybp. Conclusion is that we shouldn't do this by default. But we should consider it in the context of larger work, such as a nonlinear presolve algorithm.
@mcosovic suggested the drop_zeros! approach as well in #3510, so I should take another look at this.
Just to weigh in on this:
Currently, I have the problem that (x-0.1)^2 + sin(x) seems to be rewritten as x*x - 0.2*x + 0.01 + sin(x), most likely because a quadratic expression is built first. In the context of global optimization, the way an expression is written can matter greatly because of interval arithmetic. I think it is quite easy to ruin the intended formulation with obvious reformulations, and I would recommend not doing them automatically unless tools to configure or disable the reformulation are also in place.
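To make the interval-arithmetic point concrete with a rough back-of-the-envelope example: for $x \in [-1, 1]$, the factored form evaluates as $(x - 0.1)^2 \in [-1.1, 0.9]^2 = [0, 1.21]$, whereas the expanded form treats $x \cdot x$ as a product of two independent copies of $[-1, 1]$ and gives $x \cdot x - 0.2x + 0.01 \in [-1, 1] + [-0.2, 0.2] + 0.01 = [-1.19, 1.21]$, so the rewritten expression loses the information that the square is nonnegative.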
I think it is quite easy to ruin the intended formulation with obvious reformulations and I would recommend not doing them automatically unless tools to configure or disable this reformulation are also in place.
Yes, I think this is our conclusion too.
most likely because a quadratic expression is built first
This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr:
julia> model = Model();
julia> @variable(model, x)
x
julia> @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + (x² - 0.2 x + 0.010000000000000002)
Because of the way that JuMP parses the macros, there is no easy way to override this behavior.
This happens for any quadratic terms, because they get represented as QuadExpr subexpressions instead of NonlinearExpr. [...] Because of the way that JuMP parses the macros, there is no easy way to override this behavior.
My current workaround is to add a dummy variable fixed to zero and formulate
@expression(model, sin(x) + (dummy^3 + x - 0.1)^2)
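A minimal sketch of that workaround (my own variable names; @variable(model, dummy == 0) creates a variable fixed to zero):
julia> model = Model(); @variable(model, x);
julia> @variable(model, dummy == 0);   # dummy variable fixed to zero
julia> expr = @expression(model, sin(x) + (dummy^3 + x - 0.1)^2);  # the square is no longer expanded into a QuadExpr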
@Downsite we added @force_nonlinear in #3732, so you can write:
julia> model = Model(); @variable(model, x);
julia> @expression(model, @force_nonlinear sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)
julia> @expression(model, sin(x) + @force_nonlinear((x - 0.1)^2))
sin(x) + ((x - 0.1) ^ 2)
julia> @force_nonlinear @expression(model, sin(x) + (x - 0.1)^2)
sin(x) + ((x - 0.1) ^ 2)
For the main issue, I think our answer to this is "no". Adding simplification rules by default in JuMP just introduces the risk of problems. We want the nonlinear expressions to be passed to the solver pretty much as-is.
In the future, we could consider adding a NonlinearPresolve.Optimizer layer that users could opt in to.
I'm tempted to close this. I don't know if there's anything actionable left to do here.
Closing as won't-fix for now. We can always revisit this in the future, but I think the first step is to develop some sort of MOI nonlinear presolve layer to test ideas. This doesn't need to happen in JuMP.jl, and it doesn't need to be one of the core contributors, so if you are reading this comment and are interested, feel free to explore with some ideas in a new repository :smile: