pulp
pulp copied to clipboard
PuLP incorrectly interprets binary variables as continuous variables in some .mps files
System: MacBook (INTEL CPU), Ventura 13.2, Python 3.8 PuLP 2.7.0
When calling either HiGHS or CBC via PuLP on the MipLib2017 benchmark problems, (at least) two of them seemed to be misinterpreted. Although they are MIP problems, PuLP strips the information and passes the LP relaxation to HiGHS (or CBC).
Consider the MIP problem labeled physiciansched6-2 - attached. Both HiGHS and CBC reports 43,012.50 objective value but the correct objective value is 49,324.00. At further examination, both solvers report this problem is LP but it is not. In fact, when HiGHS is called on this problem from the command line, it correctly recognizes it as MIP and arrives with the proper solution.
The physiciansched6-2 problem has binary variables.
from pulp import *
timeLimit = 600
var, prob = LpProblem.fromMPS(modelName)
prob.solve(HiGHS_CMD(msg=1, options=["--time_limit", str(timeLimit)]))
highsObjective = prob.objective.value()
Did you check the solution status?
Stuart Mitchell PhD Engineering Science Extraordinary Freelance Programmer and Optimisation Guru www.stuartmitchell.com
On Tue, Mar 7, 2023 at 10:34 AM vladimirvalenta @.***> wrote:
I am pretty sure I found what triggers the issue - I scanned all the MipLib2017 problems and the ones that exhibit this linear relaxation problem have an .mps file that contains, under BOUNDS, both BV variables and FX variables at the same time.
— Reply to this email directly, view it on GitHub https://github.com/coin-or/pulp/issues/634#issuecomment-1457034709, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFUIWNT7RTQD3BMREBNPQ3W2ZJ4RANCNFSM6AAAAAAVRPTGSY . You are receiving this because you are subscribed to this thread.Message ID: @.***>
@stumitchell Yeah, it says it is optimal. See the log from HiGHS below when called via PuLP. Notice the problem is identified as LP.
Running HiGHS 1.4.0 [date: 1970-01-01, git hash: bcf6c0b22]
Copyright (c) 2022 ERGO-Code under MIT licence terms
LP b989347b399e4895b9e4e452b9670c35-pulp has 168336 rows; 111827 cols; 480259 nonzeros
Presolving model
60724 rows, 29990 cols, 167060 nonzeros
40106 rows, 26100 cols, 115113 nonzeros
31955 rows, 18425 cols, 98461 nonzeros
27033 rows, 13545 cols, 88603 nonzeros
26991 rows, 13541 cols, 88519 nonzeros
Presolve : Reductions: rows 26991(-141345); columns 13541(-98286); elements 88519(-391740)
Solving the presolved LP
Using EKK dual simplex solver - serial
Iteration Objective Infeasibilities num(sum)
0 0.0000000000e+00 Ph1: 0(0) 0s
11416 4.1008567996e+04 Pr: 2770(75598.9); Du: 0(1.93079e-06) 5s
18392 4.3012500000e+04 Pr: 0(0) 9s
Solving the original LP from the solution after postsolve
Model status : Optimal
Simplex iterations: 18392
Objective value : 4.3012500000e+04
HiGHS run time : 9.21
I looked at the PuLP code. Some info (mostly from CPLEX documentation):
There are two commonly used ways of extending the MPS file format to include integer variables: in the COLUMNS section or in the BOUNDS section.
In the first way, integer variables are identified within the COLUMNS section of the MPS file by marker lines. A marker line is placed at the beginning and end of a range of integer variables. ......
In the second way of treating integer variables, integer variables are declared in the BOUNDS section with special bound types
The issue is that PuLP only identifies integer variables via the first way, within the COLUMNS section, and it ignores if they are indicated within the BOUNDS section.
There are three different types as far as I can tell that indicate a variable is an integer or boolean: BV, LI, and UI. PuLP completely ignores LI and UI (I will not deal with those two in this issue). It recognizes BV as a variable type that has lb = 0 and ub = 1 but it will not set it to integer - so if that variable was not treated in the COLUMNS section, it will be defined as a continuous variable <0, 1>
Adding one line in readMPSSetBounds
in mps_lp.py
- setting the type to integer - seems to fix the issue.
if bound == "FR":
set_both_bounds(None, None)
return
elif bound == "BV":
set_both_bounds(0, 1)
# set the type to "Integer"
variable_dict[var_name]["cat"] = COL_EQUIV[True]
return
So is this just waiting for a PR to enable?
Yup a PR would be great @vladimirvalenta would you like to do one and add some tests as well?
I suppose I could do it. About the test - I can attach the one from MIPLIB2017. I dont want to be creating .MPS file by hand.
A couple of other things I also noticed when looking at the bound-setting logic:
- In addition to interpreting
BV
within the bounds,UI
bounds should also be interpreted as integral, with a specified upper bound. - Blank lines cause an exception to be thrown, since at l.60 the
line
variable is an empty list which we try to index into. FICO Xpress sometimes includes empty lines in its MPS export logic. Adding a length-check andcontinue
between l.58 & l.60 would fix this I believe.
If there's a MIPLIB test case that would also check for one or both of these conditions being interpreted properly, that would be two or three birds killed with one stone.
My personal preference would be smaller easily understood unit tests. I.e. a sample mps file that only defines one variable for example. This would make debugging easier than including a MIPLIB file and finding a solution, that would be an integration test not a unit test. @pchtsp do you have a preference?
Yep, I agree. If you can just include the minimal string of an MPS file inside the unit-test and pass it as a StringIO or something to trick pulp into reading it, even better.
For information, we were trying to adapt pulp to use the python package: https://github.com/pchtsp/pysmps Correctly importing this package (which has a better implementation of an MPS reader in theory) may already solve this issue. There were other changes, improvements (and bug corrections) that needed to be done in that package to be able to import it easily plug it into pulp. But do take a look at that package because it may already solve your problem.
I am traveling until end of June. Will get to that then.
@pchtsp is there a need for a PR or have we adapted the pysmps package?