
highspy: highs_linear_expression.__add__ modifies self

Open · few opened this issue 1 year ago · 22 comments

The following code shows that adding to a linear_expression modifies it. This is highly problematic if the linear expression is reused several times in the code.

import highspy

h = highspy.Highs()
x = h.addVariable()
linear_expression = 1*x
assert linear_expression.constant == 0
other_linear_expression = linear_expression + 1
assert linear_expression.constant == 0, "Oops, linear_expression has been modified"

few avatar Aug 20 '24 16:08 few

Interesting!

jajhall avatar Aug 20 '24 17:08 jajhall

Should I try to provide a PR? Does anybody have an opinion on a fix? I'm considering this:

  • Turn __add__ into __iadd__
  • Make __add__ return a new instance

I'm wondering if this will hurt performance when adding many linear expressions (for example in a sum(...)); if so, a specialized sum function could be written that works similarly to the current __add__.
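
A minimal sketch of that split, using a simplified stand-in class rather than the real highs_linear_expression internals: __add__ copies, __iadd__ mutates in place.

# Simplified stand-in for highs_linear_expression, only to illustrate the
# proposed __add__/__iadd__ split; not the actual highspy implementation.
class LinExpr:
    def __init__(self, coeffs=None, constant=0.0):
        self.coeffs = dict(coeffs or {})   # variable index -> coefficient
        self.constant = constant

    def __add__(self, other):
        # Non-mutating: build the result on a copy, leave self untouched.
        result = LinExpr(self.coeffs, self.constant)
        result += other
        return result

    def __iadd__(self, other):
        # Mutating, opt-in fast path for tight loops.
        if isinstance(other, LinExpr):
            for v, c in other.coeffs.items():
                self.coeffs[v] = self.coeffs.get(v, 0.0) + c
            self.constant += other.constant
        else:
            self.constant += other
        return self

e = LinExpr({0: 1.0})
f = e + 1
assert e.constant == 0.0   # e is no longer modified by the addition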

few avatar Aug 21 '24 18:08 few

Let me flag the author first. What do you think @mathgeekcoder?

jajhall avatar Aug 21 '24 19:08 jajhall

Just chiming in here with my JuMP hat on.

Does anybody have an opinion on a fix?

The highest-priority fix is to make this non-modifying and return temporary copies. That severely hurts performance, but the current behavior of modifying self in __add__ is a recipe for veeeery hard to debug issues in user code.

I'm wondering if this will hurt performance when adding many linear expressions (for example in a sum(...)); if so, a specialized sum function could be written that works similarly to the current __add__.

This is the fundamental challenge that algebraic modeling languages try to solve. There are a few approaches:

  • JuMP uses Julia's metaprogramming to rewrite sum(x[i] for i in 1:N) from naive operator overloading into an efficient in-place update
  • Python doesn't have the same metaprogramming features, so most libraries opt for some sort of gp.quicksum or pulp.lpSum which implements an in-place summation (a sketch follows this list)
  • Pyomo just builds a full expression tree and deals with compressing it later, which is why it has great features for nonlinear and higher-level modeling constructs, but poorer performance if you want to build a simple LP.
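
A minimal sketch of such an in-place summation (a hypothetical helper in the spirit of gp.quicksum / pulp.lpSum, not highspy API): coefficients accumulate into one dict, avoiding a temporary expression per addition.

# Hypothetical in-place summation helper; not highspy API.
def quicksum(terms):
    """terms: iterable of (variable index, coefficient) pairs."""
    coeffs = {}
    for v, c in terms:
        coeffs[v] = coeffs.get(v, 0.0) + c   # O(1) in-place update
    return coeffs

expr = quicksum((i, 1.0) for i in range(1000))   # one dict, not 1000 copies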

odow avatar Aug 21 '24 21:08 odow

This is a good comment for discussion. There are 2 main reasons why highs_linear_expression isn't immutable: the ability to express constraints like 5 <= x0 + 2*x1 <= 15, and performance.

If it were immutable we'd need to write 5 <= (x0 + 2*x1 <= 15) or (5 <= x0 + 2*x1) <= 15 instead - it doesn't work without the brackets, which is confusing for users and a little ugly. There might be a fix, but I didn't see an obvious solution at the time.

The performance of the __add__ is a concern with large sum(x). It's possible to write a specialized sum, but it'd still be slow for other data structures, e.g., consider the numpy code below, which uses its own diagonal().sum().

import numpy as np
import highspy

N = 8
h = highspy.Highs()
x = np.reshape(h.addBinaries(N*N), (N, N))

h.addConstrs(sum(x[i,:]) == 1 for i in range(N))    # each row has exactly one queen
h.addConstrs(sum(x[:,j]) == 1 for j in range(N))    # each col has exactly one queen

y = np.fliplr(x)
h.addConstrs(x.diagonal(k).sum() <= 1 for k in range(-N + 1, N))   # each diagonal has at most one queen
h.addConstrs(y.diagonal(k).sum() <= 1 for k in range(-N + 1, N))   # each 'reverse' diagonal has at most one queen

That said, I'm curious about your use case? If you want explicit immutability, is the following suitable for you?

import highspy
from highspy.highs import highs_linear_expression

h = highspy.Highs()
x = h.addVariable()
linear_expression = 1*x
assert linear_expression.constant == 0
other_linear_expression = highs_linear_expression(linear_expression) + 1
assert linear_expression.constant == 0, "no longer fails: linear_expression was not modified"

mathgeekcoder avatar Aug 21 '24 21:08 mathgeekcoder

I'm curious about your use case? If you want explicit immutability

This is a solution. But the main issue is that the defaults are the wrong way round:

  • By default, novice users are opted into a setup where they can easily make mistakes that are very hard to debug, and harder still to even notice
  • More advanced users can opt out by wrapping expressions in highs_linear_expression to get new copies.

gurobipy, pulp, etc have all thought through this decision as well, and there's a reason they have their own specialized summations.

odow avatar Aug 21 '24 21:08 odow

I understand the benefits of immutability, but I'm still curious about the use case. I presume it's when users want to keep the expression around and try to build on top of it?

My intent was that the user would never keep an explicit highs_linear_expression object reference. That is, they are meant to be transitory, constructed at the moment the constraints are added.

With this mindset, I'm curious if there are still simple examples where the user can easily make mistakes?

mathgeekcoder avatar Aug 21 '24 21:08 mathgeekcoder

My intent was that the user would never keep an explicit highs_linear_expression object reference

Oh, but they do. All the time.

This is JuMP syntax, since I'm not familiar with highspy, but models like this are super common:

using JuMP
model = Model()
demand, emission_limit = 100.0, 10.0   # example problem data
@variable(model, thermal_generators[1:3] >= 0)
@variable(model, renewable_generators[1:3] >= 0)
thermal_generation = sum(thermal_generators)
renewable_generation = sum(renewable_generators)
@constraint(model, thermal_generation + renewable_generation == demand)
@constraint(model, thermal_generation <= emission_limit)

Here the thermal_generation in the last constraint would actually be thermal_generation + renewable_generation because of the in-place modification in the penultimate constraint.
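
For concreteness, here is roughly the same pattern in highspy syntax (a sketch; demand and emission_limit stand in for the problem data):

import highspy

demand, emission_limit = 100.0, 10.0   # placeholder problem data

h = highspy.Highs()
thermal_generators = [h.addVariable() for _ in range(3)]
renewable_generators = [h.addVariable() for _ in range(3)]

thermal_generation = sum(thermal_generators)       # expression kept around and reused
renewable_generation = sum(renewable_generators)

h.addConstr(thermal_generation + renewable_generation == demand)
# With a mutating __add__, thermal_generation now silently includes the
# renewable terms, so this second constraint is not what the user wrote:
h.addConstr(thermal_generation <= emission_limit)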

odow avatar Aug 21 '24 22:08 odow

Thanks for the example. I don't personally like that style of usage, but I can see it's a valid consideration.

Changing highspy to be immutable is trivial, except for the 5 <= x0 + 2*x1 <= 15 style syntax. I'll have another look into it.

mathgeekcoder avatar Aug 22 '24 01:08 mathgeekcoder

I've got a workaround for the 5 <= x0 + 2*x1 <= 15 style syntax. I'll clean it up and make a PR tomorrow.

mathgeekcoder avatar Aug 22 '24 02:08 mathgeekcoder

Maybe you should have __add__ for operations like x1 + x2 <= 10 and __iadd__ with a custom sum function so you can do sum(var[i] * coef[i] for i in range(N)) <= 10. I have implemented this approach here and I believe PuLP does the same.

guifcoelho avatar Aug 22 '24 19:08 guifcoelho

Maybe you should have __add__ for operations like x1 + x2 <= 10 and __iadd__

Oh absolutely. I already wrote this support when I made the other changes for "immutability". The main difficulty was still supporting "comparison chaining" in a safe and clean way.

Comparison chaining support isn't strictly necessary, but it's elegant syntax if you need both lb and ub for a constraint. AFAIK most other solvers don't natively support lb/ub bounds and instead use multiple constraints or auxiliary variables (e.g., Gurobi addRange). Note: I've also added support for expr == [lb, ub], but it's not quite as intuitive as lb <= expr <= ub.

If anyone is interested: in Python, lb <= expr <= ub is interpreted as bool(lb <= expr) and (expr <= ub). Since the expressions are now immutable, this would simply return expr <= ub, and the lb <= expr would be thrown away. I believe that with a little trickery behind the scenes it's possible to safely 'reinterpret' this as (lb <= expr) <= ub, which returns what you would expect.
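
A plain-Python illustration of that evaluation order (Traced is just a tracing dummy, unrelated to highspy):

# Python desugars `5 <= expr <= 15` into `(5 <= expr) and (expr <= 15)`,
# calling bool() on the first result to decide whether to evaluate the second.
class Traced:
    def __ge__(self, other):            # 5 <= expr resolves to expr.__ge__(5)
        print(f"__ge__ called with {other!r}")
        return self                     # truthy, so the chain continues
    def __le__(self, other):
        print(f"__le__ called with {other!r}")
        return self

expr = Traced()
result = 5 <= expr <= 15
# prints "__ge__ called with 5" then "__le__ called with 15"; only the
# second comparison's return value survives as `result`.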

mathgeekcoder avatar Aug 22 '24 20:08 mathgeekcoder

Hmm... I did some research, and it might not be possible. As you said, Python might not evaluate all conditions. But let us know if you find any way around it.

Depending on how you compute the comparisons, you could do something like this:

(x1 + x2 + x3).within(5, 10)

You would have to transform x1 + x2 + x3 into a linear expression object and implement the within(...) method on the linear expression class. The within(...) method would then return two constraints to be added to the model.
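
A minimal sketch of that suggestion, using hypothetical placeholder classes (not highspy API):

from dataclasses import dataclass

# Hypothetical placeholder classes; a real implementation would return
# highspy constraint objects instead.
@dataclass
class Constraint:
    expr: "LinearExpression"
    lb: float = float("-inf")
    ub: float = float("inf")

class LinearExpression:
    def within(self, lb, ub):
        # Return the pair lb <= expr and expr <= ub for the model to add.
        return Constraint(self, lb=lb), Constraint(self, ub=ub)

lower, upper = LinearExpression().within(5, 10)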

guifcoelho avatar Aug 23 '24 13:08 guifcoelho

Thanks for also looking into it! I've got the typical use cases to work, but it's not the most elegant implementation (see PR).

I override __bool__ to capture the potential chain, returning True to force evaluation of any remaining comparisons. I use a shared object to capture the chain so that each highs_linear_expression remains immutable (unless __iadd__ is called explicitly). The shared object uses thread-local storage so that it's all thread safe.
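
A rough, simplified sketch of that mechanism (the actual PR differs in details):

import threading

# The first comparison returns an Inequality whose __bool__ stashes it in
# thread-local storage and returns True; the second comparison then finds
# the stash and reinterprets the chain as a single lb <= expr <= ub range.
_tls = threading.local()

class Expr:
    def __ge__(self, lb):               # `lb <= expr` resolves to expr.__ge__(lb)
        return Inequality(self, lb=lb)

    def __le__(self, ub):               # `expr <= ub`
        pending = getattr(_tls, "pending", None)
        _tls.pending = None
        if pending is not None and pending.expr is self:
            return Inequality(self, lb=pending.lb, ub=ub)   # full range
        return Inequality(self, ub=ub)

class Inequality:
    def __init__(self, expr, lb=None, ub=None):
        self.expr, self.lb, self.ub = expr, lb, ub

    def __bool__(self):
        # Called by Python between the two halves of a chained comparison.
        _tls.pending = self
        return True

e = Expr()
c = 5 <= e <= 15
assert (c.lb, c.ub) == (5, 15)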

Your within syntax could easily be supported; it's similar to the gurobipy expr == [lb, ub], which I supported in the PR. Though I'm hesitant to add this method, since we'd probably need to add many similar methods for consistency.

Note: highs needs only one constraint for lb <= expr <= ub. It's different from (some might argue better than!) the other solvers in this regard.

One remaining item is how to provide meaningful errors if the user writes something that seems reasonable at first glance but can't be expressed as a single constraint, e.g., x0 <= x1 <= 4. Now, x0 <= x1 and x1 <= 4 is reasonable, but I don't believe we can write that logic as a single linear constraint (the two sides involve different linear forms, so it doesn't fit a single lb <= expr <= ub row).

I'd prefer to avoid converting these to multiple constraints, as I'm sure the logic can get very complex! Similarly, I'd prefer to avoid interpreting expressions like expr1 or expr2 <= 1.

mathgeekcoder avatar Aug 23 '24 16:08 mathgeekcoder

Maybe a bit unrelated, but since you're working on the constraints... I started converting a modeling program to highspy and ran into this error: Cannot construct constraint with variables as bounds. The program constructs a constraint of the form expr <= t, where expr is a highs_linear_expression and t is a highs_var. This works in other modeling tools and feels quite natural to write.

few avatar Aug 24 '24 06:08 few

Have you considered creating two classes? One for sums of variables multiplied by constants and one for constraints? Initially I thought one could do something like this:

import highspy

h = highspy.Highs()
x = h.addVariable()
y = h.addVariable()
linear_expression_x = (1 <= x) <= 2
print(linear_expression_x.LHS, linear_expression_x.RHS)
linear_expression_y = (3 <= y) <= 7
print(linear_expression_y.LHS, linear_expression_y.RHS)
linear_expression = linear_expression_x + linear_expression_y
print(linear_expression.LHS, linear_expression.RHS)

What should linear_expression be? I'd have said 4 <= x + y <= 9, but it turns out to be 3 <= x + y <= 2.

While I don't have a use case for this in my own modeling, I still find it confusing. Maybe this and other problems go away when the two use cases are expressed as two separate classes: linear_expression.__add__ would then return another linear_expression, __le__ would return a constraint, and constraint wouldn't have an __add__.

few avatar Aug 24 '24 07:08 few

Good spotting @few! Yes, I noticed similar edge cases when testing the new code for chained comparisons. I'm most of the way through a major refactoring of the inequality logic, plus many updated tests. It includes the cases you mentioned above.

Once complete, I believe the new behavior will be more intuitive, with better error handling for unsupported cases.

mathgeekcoder avatar Aug 26 '24 23:08 mathgeekcoder

In fact, this is the reason why PyOptInterface creates a separate ExprBuilder class that uses in-place mutable operations to build expressions efficiently: https://metab0t.github.io/PyOptInterface/expression.html

It is much faster than ordinary linear expressions in practice.

ExprBuilder uses a hash map internally to allow fast addition and removal of variables, and fast changes to their coefficients. When it is used to add a constraint, we flatten it to two arrays of coefficients and variable indices, then pass them to the solver.
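
The general idea in Python terms (a sketch; PyOptInterface's ExprBuilder is implemented in C++ and differs in detail):

# Hash map keyed by variable index: adding, removing, and re-weighting a
# term is O(1); flatten() produces the parallel index/coefficient arrays a
# solver API expects.
class ExprBuilder:
    def __init__(self):
        self.terms = {}                  # variable index -> coefficient

    def add(self, var, coef=1.0):
        c = self.terms.get(var, 0.0) + coef
        if c == 0.0:
            self.terms.pop(var, None)    # drop terms that cancel out
        else:
            self.terms[var] = c

    def flatten(self):
        idx = list(self.terms)
        return idx, [self.terms[i] for i in idx]

b = ExprBuilder()
b.add(0, 2.0); b.add(1); b.add(0, -2.0)  # the x0 terms cancel
assert b.flatten() == ([1], [1.0])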

metab0t avatar Aug 27 '24 08:08 metab0t

Thanks @metab0t. I just tried PyOptInterface; it seems very fast!

Though running your nqueens example with N > 10 crashes for me. N <= 10 works great, except it only works once (it crashes on the second run), which makes it hard to compare timings. Note the crash is in optimize(), after highs seems to have finished.

It might just be my system, but I couldn't give it a scale test. I was curious how it performs against direct addRow calls to highs. That's one nice thing about highspy: if performance is a concern, you can always make direct calls to the highs API, for slightly less syntactic sugar.

mathgeekcoder avatar Aug 27 '24 18:08 mathgeekcoder

@mathgeekcoder Sorry that I cannot reproduce the crash on my machine.

This is the benchmark script I use to test the model construction time:

Benchmark script
import numpy as np
import pyoptinterface as poi
from pyoptinterface import highs
import highspy

import time


def test_poi(N):
    model = highs.Model()

    x = np.empty((N, N), dtype=object)
    for i in range(N):
        for j in range(N):
            x[i, j] = model.add_variable(domain=poi.VariableDomain.Binary)

    for i in range(N):
        # Row and column
        model.add_linear_constraint(poi.quicksum(x[i, :]), poi.Eq, 1.0)
        model.add_linear_constraint(poi.quicksum(x[:, i]), poi.Eq, 1.0)
    for i in range(-N + 1, N):
        # Diagonal
        model.add_linear_constraint(poi.quicksum(x.diagonal(i)), poi.Leq, 1.0)
        # Anti-diagonal
        model.add_linear_constraint(
            poi.quicksum(np.fliplr(x).diagonal(i)), poi.Leq, 1.0
        )

    # model.optimize()


def test_highspy(N):
    h = highspy.Highs()
    x = np.reshape([h.addBinary() for _ in range(N * N)], (N, N))

    for i in range(N):
        h.addConstr(sum(x[i, :]) == 1)  # each row has exactly one queen
        h.addConstr(sum(x[:, i]) == 1)  # each col has exactly one queen

    y = np.fliplr(x)
    for k in range(-N + 1, N):
        h.addConstr(sum(x.diagonal(k)) <= 1)
        h.addConstr(sum(y.diagonal(k)) <= 1)

    # h.solve()


def main(Ns=range(100, 900, 100)):
    poi_timings = []
    highspy_timings = []

    for N in Ns:
        print(f"N = {N}")
        start = time.time()
        test_poi(N)
        poi_timings.append(time.time() - start)
        start = time.time()
        test_highspy(N)
        highspy_timings.append(time.time() - start)

    for N, poi_time, highspy_time in zip(Ns, poi_timings, highspy_timings):
        print(f"N = {N}")
        print(f"POI: {poi_time:.2f}s")
        print(f"Highs: {highspy_time:.2f}s")


if __name__ == "__main__":
    main()

The result is:

Benchmark result

N    POI (s)  Highs (s)
100  0.04     0.36
200  0.17     1.42
300  0.36     3.27
400  0.75     7.76
500  2.33     15.86
600  1.54     13.06
700  2.15     18.43
800  2.80     24.29

PyOptInterface calls the HiGHS C API directly (e.g., Highs_addRow for add_linear_constraint), so the performance overhead is quite small.

metab0t avatar Aug 28 '24 09:08 metab0t

Thanks @metab0t! Yeah, it's probably just my machine. BTW, the crash only happens if I call optimize; it builds the model just fine. I wasn't sure if POI did some caching behind the scenes and only updated the model on the optimize call.

I appreciate the benchmarks! BTW, the latest highspy (which I'm currently working on) has some additional syntax and much better performance than the released version. Also note that the "direct" highs API performance is hard to beat! See updated benchmarks below.

Benchmark script
import timeit
import highspy
import numpy as np
import pyoptinterface as poi
from pyoptinterface import highs
from functools import partial

def test_highspy(N):
    h = highspy.Highs()
    h.silent()

    x = np.reshape(h.addBinaries(N*N), (N, N))
    y = np.fliplr(x)

    h.addConstrs(h.qsum(x[i,:]) == 1 for i in range(N))    # each row has exactly one queen
    h.addConstrs(h.qsum(x[:,j]) == 1 for j in range(N))    # each col has exactly one queen

    h.addConstrs(h.qsum(x.diagonal(k)) <= 1 for k in range(-N + 1, N))   # each diagonal has at most one queen
    h.addConstrs(h.qsum(y.diagonal(k)) <= 1 for k in range(-N + 1, N))   # each 'reverse' diagonal has at most one queen

    return h

def test_highs(N):
    h = highspy.Highs()
    h.silent()

    NxN = N*N
    x = np.arange(NxN, dtype=np.int32).reshape(N, N)

    zero, ones = np.zeros(NxN), np.ones(NxN)
    h.addCols(NxN, zero, zero, ones, 0, [], [], [])
    h.changeColsIntegrality(NxN, x, np.full(NxN, highspy.HighsVarType.kInteger, dtype=np.int32))

    for i in range(N):
        h.addRow(1, 1, N, np.array(x[i,:]), ones)    # each row has exactly one queen
        h.addRow(1, 1, N, np.array(x[:,i]), ones)    # each col has exactly one queen

    y = np.fliplr(x)
    for k in range(-N + 1, N):
        h.addRow(-h.inf, 1, min(k+N,N-k), np.array(x.diagonal(k)), ones)   # each diagonal has at most one queen
        h.addRow(-h.inf, 1, min(k+N,N-k), np.array(y.diagonal(k)), ones)   # each 'reverse' diagonal has at most one queen

    return h

def test_poi(N):
    model = highs.Model()
    model.set_raw_parameter('output_flag', False)

    x = np.empty((N, N), dtype=object)
    for i in range(N):
        for j in range(N):
            x[i, j] = model.add_variable(domain=poi.VariableDomain.Binary)

    for i in range(N):
        model.add_linear_constraint(poi.quicksum(x[i, :]), poi.Eq, 1.0)
        model.add_linear_constraint(poi.quicksum(x[:, i]), poi.Eq, 1.0)

    y = np.fliplr(x)
    for i in range(-N + 1, N):
        model.add_linear_constraint(poi.quicksum(x.diagonal(i)), poi.Leq, 1.0)
        model.add_linear_constraint(poi.quicksum(y.diagonal(i)), poi.Leq, 1.0)


for N in range(100, 900, 100):
    print(N,end='|')
    print(f"{np.average(timeit.Timer(partial(test_poi,     N)).repeat(50, 1)):.3f}", end='|')
    print(f"{np.average(timeit.Timer(partial(test_highspy, N)).repeat(50, 1)):.3f}", end='|')
    print(f"{np.average(timeit.Timer(partial(test_highs,   N)).repeat(50, 1)):.3f}", end='|\n')

FYI: I'm not claiming highspy is faster than POI in general, but it seems that way for large nqueens instances.

N    POI (s)  highspy (s)  highs (s)
100  0.024    0.028        0.003
200  0.091    0.092        0.009
300  0.207    0.191        0.018
400  0.369    0.344        0.035
500  0.591    0.522        0.050
600  0.862    0.748        0.067
700  1.173    1.006        0.086
800  1.588    1.339        0.129

mathgeekcoder avatar Aug 29 '24 01:08 mathgeekcoder

@mathgeekcoder Thanks for your benchmark and improvements to highspy! PyOptInterface does not do any caching; it just translates each step into a direct call to highs.

POI is a little slower here because the hash-map-based expression cannot beat the flat representation used by highspy. However, it still has some advantages: it merges terms automatically, which is especially useful when an expression contains many terms for the same variable. Array-based flat representations and hash-map representations simply have different performance characteristics.

metab0t avatar Aug 29 '24 02:08 metab0t

Closed by #1942

jajhall avatar Sep 26 '24 09:09 jajhall