pulp icon indicating copy to clipboard operation
pulp copied to clipboard

PuLP incorrectly interprets binary variables as continuous variables in some .mps files

Open vladimirvalenta opened this issue 1 year ago • 11 comments

System: MacBook (INTEL CPU), Ventura 13.2, Python 3.8 PuLP 2.7.0

When calling either HiGHS or CBC via PuLP on the MipLib2017 benchmark problems, (at least) two of them seemed to be misinterpreted. Although they are MIP problems, PuLP strips the information and passes the LP relaxation to HiGHS (or CBC).

Consider the MIP problem labeled physiciansched6-2 - attached. Both HiGHS and CBC reports 43,012.50 objective value but the correct objective value is 49,324.00. At further examination, both solvers report this problem is LP but it is not. In fact, when HiGHS is called on this problem from the command line, it correctly recognizes it as MIP and arrives with the proper solution.

The physiciansched6-2 problem has binary variables.

from pulp import *
timeLimit = 600
var, prob = LpProblem.fromMPS(modelName)
prob.solve(HiGHS_CMD(msg=1, options=["--time_limit", str(timeLimit)]))
highsObjective = prob.objective.value()

physiciansched6-2.mps.zip

vladimirvalenta avatar Mar 06 '23 18:03 vladimirvalenta

Did you check the solution status?

Stuart Mitchell PhD Engineering Science Extraordinary Freelance Programmer and Optimisation Guru www.stuartmitchell.com

On Tue, Mar 7, 2023 at 10:34 AM vladimirvalenta @.***> wrote:

I am pretty sure I found what triggers the issue - I scanned all the MipLib2017 problems and the ones that exhibit this linear relaxation problem have an .mps file that contains, under BOUNDS, both BV variables and FX variables at the same time.

— Reply to this email directly, view it on GitHub https://github.com/coin-or/pulp/issues/634#issuecomment-1457034709, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFUIWNT7RTQD3BMREBNPQ3W2ZJ4RANCNFSM6AAAAAAVRPTGSY . You are receiving this because you are subscribed to this thread.Message ID: @.***>

stumitchell avatar Mar 06 '23 21:03 stumitchell

@stumitchell Yeah, it says it is optimal. See the log from HiGHS below when called via PuLP. Notice the problem is identified as LP.

Running HiGHS 1.4.0 [date: 1970-01-01, git hash: bcf6c0b22]                    
Copyright (c) 2022 ERGO-Code under MIT licence terms
LP   b989347b399e4895b9e4e452b9670c35-pulp has 168336 rows; 111827 cols; 480259 nonzeros
Presolving model
60724 rows, 29990 cols, 167060 nonzeros
40106 rows, 26100 cols, 115113 nonzeros
31955 rows, 18425 cols, 98461 nonzeros
27033 rows, 13545 cols, 88603 nonzeros
26991 rows, 13541 cols, 88519 nonzeros
Presolve : Reductions: rows 26991(-141345); columns 13541(-98286); elements 88519(-391740)
Solving the presolved LP
Using EKK dual simplex solver - serial
  Iteration        Objective     Infeasibilities num(sum)
          0     0.0000000000e+00 Ph1: 0(0) 0s
      11416     4.1008567996e+04 Pr: 2770(75598.9); Du: 0(1.93079e-06) 5s
      18392     4.3012500000e+04 Pr: 0(0) 9s
Solving the original LP from the solution after postsolve
Model   status      : Optimal
Simplex   iterations: 18392
Objective value     :  4.3012500000e+04
HiGHS run time      :          9.21

vladimirvalenta avatar Mar 06 '23 21:03 vladimirvalenta

I looked at the PuLP code. Some info (mostly from CPLEX documentation):

There are two commonly used ways of extending the MPS file format to include integer variables: in the COLUMNS section or in the BOUNDS section.

In the first way, integer variables are identified within the COLUMNS section of the MPS file by marker lines. A marker line is placed at the beginning and end of a range of integer variables.  ......

In the second way of treating integer variables, integer variables are declared in the BOUNDS section with special bound types 

The issue is that PuLP only identifies integer variables via the first way, within the COLUMNS section, and it ignores if they are indicated within the BOUNDS section.

There are three different types as far as I can tell that indicate a variable is an integer or boolean: BV, LI, and UI. PuLP completely ignores LI and UI (I will not deal with those two in this issue). It recognizes BV as a variable type that has lb = 0 and ub = 1 but it will not set it to integer - so if that variable was not treated in the COLUMNS section, it will be defined as a continuous variable <0, 1>

Adding one line in readMPSSetBounds in mps_lp.py - setting the type to integer - seems to fix the issue.

if bound == "FR":
        set_both_bounds(None, None)
        return
    elif bound == "BV":
        set_both_bounds(0, 1)
# set the type to "Integer"
        variable_dict[var_name]["cat"] = COL_EQUIV[True]
        return

vladimirvalenta avatar Mar 07 '23 18:03 vladimirvalenta

So is this just waiting for a PR to enable?

danielolsen avatar May 19 '23 16:05 danielolsen

Yup a PR would be great @vladimirvalenta would you like to do one and add some tests as well?

stumitchell avatar May 21 '23 20:05 stumitchell

I suppose I could do it. About the test - I can attach the one from MIPLIB2017. I dont want to be creating .MPS file by hand.

vladimirvalenta avatar May 23 '23 12:05 vladimirvalenta

A couple of other things I also noticed when looking at the bound-setting logic:

  • In addition to interpreting BV within the bounds, UI bounds should also be interpreted as integral, with a specified upper bound.
  • Blank lines cause an exception to be thrown, since at l.60 the line variable is an empty list which we try to index into. FICO Xpress sometimes includes empty lines in its MPS export logic. Adding a length-check and continue between l.58 & l.60 would fix this I believe.

If there's a MIPLIB test case that would also check for one or both of these conditions being interpreted properly, that would be two or three birds killed with one stone.

danielolsen avatar May 23 '23 21:05 danielolsen

My personal preference would be smaller easily understood unit tests. I.e. a sample mps file that only defines one variable for example. This would make debugging easier than including a MIPLIB file and finding a solution, that would be an integration test not a unit test. @pchtsp do you have a preference?

stumitchell avatar May 23 '23 22:05 stumitchell

Yep, I agree. If you can just include the minimal string of an MPS file inside the unit-test and pass it as a StringIO or something to trick pulp into reading it, even better.

For information, we were trying to adapt pulp to use the python package: https://github.com/pchtsp/pysmps Correctly importing this package (which has a better implementation of an MPS reader in theory) may already solve this issue. There were other changes, improvements (and bug corrections) that needed to be done in that package to be able to import it easily plug it into pulp. But do take a look at that package because it may already solve your problem.

pchtsp avatar May 24 '23 20:05 pchtsp

I am traveling until end of June. Will get to that then.

vladimirvalenta avatar May 31 '23 17:05 vladimirvalenta

@pchtsp is there a need for a PR or have we adapted the pysmps package?

vladimirvalenta avatar Aug 28 '23 02:08 vladimirvalenta