Inexplicable performance variation
Hello,
I'm really struggling to pinpoint this issue. When I call HiGHS solver from my application, for one specific (very simple!) problem, the optimisation time is much higher than expected. Yet if I try to debug or reproduce it, it works fine. Perhaps you could help me see where the issue is (…or not).
This is on Windows 11, calling via C# interface, but for debugging I call in the same conditions (from what I can tell!).
The input .mps file attached: too-slow-2023-12-11.zip
I've set log_dev_level to 3, but comparing the two log files I can't see any difference between one an another case (apart from the time, of course).
Here's the “slow” log:
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Running with 8 thread(s)
Solving model: LP-problem
Presolving model
19 rows, 59 cols, 103 nonzeros
HPresolve::removeDependentEquations Got 7 equations, checking for dependent equations
HPresolve::removeDependentEquations Removed 0 rows and 0 nonzeros
15 rows, 45 cols, 82 nonzeros
presolve_.run() returns status: Reduced
Presolve : Reductions: rows 15(-6); columns 45(-16); elements 82(-29)
Solving the presolved LP
Using EKK parallel dual simplex solver - PAMI with concurrency of 8
Cost perturbation for LP-problem
Initially have 45 nonzero costs (100%) with min / average / max = 0.4 / 73.9265 / 256
Large so set max_abs_cost = sqrt(sqrt(max_abs_cost)) = 4
Perturbation column base = 2e-06
Perturbation row base = 1e-12
dual-phase-1-start
Iteration Objective Infeasibilities num(sum)
DuPh1 0 -1.1799937799e+00 Ph1: 1(1); Du: 1(1.17999) No reason
Iteration Objective
DuPh1 1 0.0000000000e+00
Iteration Objective Infeasibilities num(sum)
DuPh1 1 0.0000000000e+00 Ph1: 0(0) Possibly optimal
dual-phase-1-optimal
dual-phase-2-start
Iteration Objective Infeasibilities num(sum)
DuPh2 1 -5.9881191554e+07 Pr: 15(3.39494e+06) No reason
Iteration Objective
DuPh2 2 -5.8700226771e+07
DuPh2 3 -5.4798309406e+07
DuPh2 4 -5.1449702811e+07
DuPh2 5 -4.5085582062e+07
DuPh2 6 -4.3351304830e+07
DuPh2 7 -3.9409862782e+07
DuPh2 8 -3.6631797483e+07
DuPh2 9 -3.3317519821e+07
DuPh2 10 -3.0530258413e+07
DuPh2 11 -3.0087393072e+07
DuPh2 12 -3.0087391906e+07
DuPh2 13 -2.9704527676e+07
DuPh2 14 -2.8421173722e+07
DuPh2 15 -2.8421170942e+07
DuPh2 16 -2.8421162628e+07
DuPh2 17 -2.7592579526e+07
DuPh2 18 -2.2446835277e+07
DuPh2 19 -2.2446824310e+07
DuPh2 20 -2.2446823837e+07
DuPh2 21 -2.2446823808e+07
DuPh2 22 -2.2446823148e+07
DuPh2 23 -2.2446822801e+07
DuPh2 24 -1.9396814345e+07
DuPh2 25 -1.7078950457e+07
DuPh2 26 -1.7078950122e+07
DuPh2 27 -1.7078950076e+07
DuPh2 28 -1.7078950161e+07
DuPh2 29 -1.7078949906e+07
DuPh2 30 -1.7078949798e+07
Iteration Objective Infeasibilities num(sum)
DuPh2 30 -1.7078949798e+07 Pr: 0(0); Du: 0(8.83463e-06) Possibly optimal
dual-phase-2-optimal
dual-cleanup-shift
Iteration Objective Infeasibilities num(sum)
DuPh2 30 -1.7078915498e+07 Pr: 0(0); Du: 2(2.95116e-05) Perturbation cleanup
HEkkDual:: Using primal simplex to try to clean up num / max / sum = 2 / 1.01069e-05 / 2.95116e-05 dual infeasibilities
Primal feasible and num / max / sum dual infeasibilities of 2 / 1.01069e-05 / 2.95116e-05, so near-optimal
Near-optimal, so don't use bound perturbation
primal-phase2-start
Iteration Objective Infeasibilities num(sum)
PrPh2 30 -1.7078915498e+07 Pr: 0(0); Du: 2(2.95116e-05) No reason
Iteration Objective
PrPh2 31 -1.7078915498e+07
Iteration Objective Infeasibilities num(sum)
PrPh2 31 -1.7078915498e+07 Pr: 0(0); Du: 0(1.4097e-05) Possibly optimal
primal-phase-2-optimal
problem-optimal
Simplex iterations: DuPh1 1; DuPh2 29; PrPh2 1; Total 31
EKK dual simplex solver returns 0 primal and 0 dual infeasibilities: Status Optimal
Have num/max/sum primal (0/0/0) and dual (1/0.00115163/0.00115178) unscaled infeasibilities
Using EKK primal simplex solver
primal-phase2-start
Value( 27) = -0.5906 exceeds lower = -3.437e-07 by 0.5906, so shift bound by 0.5906 to -0.5906: infeasibility -1.687e-05 with error 4.91042e-17
Value( 1) = -0.3787 exceeds lower = -4.538e-07 by 0.3787, so shift bound by 0.3787 to -0.3787: infeasibility -1.908e-05 with error 1.31324e-17
Value( 22) = 1.226e+06 exceeds upper = 1.226e+06 by 0.3125, so shift bound by 0.3125 to 1.226e+06: infeasibility -1.026e-05 with error 3.75305e-12
phase2CorrectPrimal: num / max / sum primal corrections = 3 / 0.590639 / 1.28185
Iteration Objective Infeasibilities num(sum)
PrPh2 31 -1.7078916544e+07 Pr: 0(0); Du: 1(0.00115178) No reason
Iteration Objective
PrPh2 32 -1.7078916559e+07
Iteration Objective Infeasibilities num(sum)
PrPh2 32 -1.7078916559e+07 Pr: 0(0); Du: 0(3.25726e-07) Possibly optimal
primal-phase-2-optimal
primal-cleanup-shift
Iteration Objective Infeasibilities num(sum)
PrPh2 32 -1.7078915513e+07 Pr: 0(0); Du: 0(3.25726e-07) Perturbation cleanup
problem-optimal
Simplex iterations: PrPh2 1; Total 1
EKK primal simplex solver returns 0 primal and 0 dual infeasibilities: Status Optimal
Postsolve finished
Solving the original LP from the solution after postsolve
Using EKK primal simplex solver
primal-phase2-start
Value( 64) = -4.301e+04 exceeds lower = -4.301e+04 by 0.008659, so shift bound by 0.008674 to -4.301e+04: infeasibility -1.442e-05 with error 2.79149e-12
Value( 38) = -0.002573 exceeds lower = -7.635e-08 by 0.002573, so shift bound by 0.002585 to -0.002585: infeasibility -1.153e-05 with error 1.01644e-19
Value( 1) = -0.005225 exceeds lower = -6.322e-08 by 0.005225, so shift bound by 0.005236 to -0.005236: infeasibility -1.126e-05 with error 1.66018e-19
Value( 33) = 1.916e+04 exceeds upper = 1.916e+04 by 0.0008196, so shift bound by 0.0008384 to 1.916e+04: infeasibility -1.882e-05 with error 8.93627e-13
phase2CorrectPrimal: num / max / sum primal corrections = 4 / 0.00867361 / 0.0173325
Iteration Objective Infeasibilities num(sum)
PrPh2 32 -1.7078917684e+07 Pr: 0(0); Du: 8(695.04) No reason
Value( 14) = -0.005774 exceeds lower = -4.631e-07 by 0.005773, so shift bound by 0.005792 to -0.005793: infeasibility -1.926e-05 with error 1.32137e-19
Iteration Objective
PrPh2 33 -1.7078916369e+07
Value( 15) = -0.0007562 exceeds lower = -2.854e-07 by 0.0007559, so shift bound by 0.0007716 to -0.0007719: infeasibility -1.571e-05 with error 1.35525e-20
PrPh2 34 -1.7078916196e+07
PrPh2 35 -1.7078917268e+07
PrPh2 36 -1.7078917300e+07
PrPh2 37 -1.7078917325e+07
Iteration Objective Infeasibilities num(sum)
PrPh2 37 -1.7078917325e+07 Pr: 0(0); Du: 0(3.05052e-05) Possibly optimal
primal-phase-2-optimal
primal-cleanup-shift
Iteration Objective Infeasibilities num(sum)
PrPh2 37 -1.7078915571e+07 Pr: 0(0); Du: 0(3.05052e-05) Perturbation cleanup
problem-optimal
Simplex iterations: PrPh2 5; Total 5
EKK primal simplex solver returns 0 primal and 0 dual infeasibilities: Status Optimal
Have num/max/sum primal (0/0/0) and dual (2/0.000822368/0.00164502) unscaled infeasibilities
Using EKK primal simplex solver
Primal feasible and num / max / sum dual infeasibilities of 2 / 0.000822368 / 0.00164502, so near-optimal
Near-optimal, so don't use bound perturbation
primal-phase2-start
Iteration Objective Infeasibilities num(sum)
PrPh2 37 -1.7078915571e+07 Pr: 0(0); Du: 2(0.00164502) No reason
Iteration Objective
PrPh2 38 -1.7078915571e+07
Iteration Objective Infeasibilities num(sum)
PrPh2 38 -1.7078915571e+07 Pr: 0(0); Du: 0(4.32342e-07) Possibly optimal
primal-phase-2-optimal
problem-optimal
Simplex iterations: PrPh2 1; Total 1
EKK primal simplex solver returns 0 primal and 0 dual infeasibilities: Status Optimal
Postsolve : 6
Time : 90.00
Time Pre : 0.00
Time PreLP : 89.99
Time PostLP: 0.00
For LP LP-problem: Presolve 0.00 ( 0%): Solve presolved LP 89.99 ( 99%): Solve original LP 0.00 ( 0%)
Model status : Optimal
Simplex iterations: 38
Objective value : -1.7078915571e+07
HiGHS run time : 90.00
Via file compare, the only difference with the “fast” (debug) log is this:
Time : 0.45
Time Pre : 0.02
Time PreLP : 0.23
Time PostLP: 0.20
For LP LP-problem: Presolve 0.02 ( 3%): Solve presolved LP 0.23 ( 49%): Solve original LP 0.20 ( 45%)
Model status : Optimal
Simplex iterations: 38
Objective value : -1.7078915571e+07
HiGHS run time : 0.47
I'm quite puzzled by this, and would greatly appreciate any feedback.
I've downloaded your example, and it solves in no time from the command line - which is what you would also expect, given your comments. My only observation was that on a none-too-new PC, the run time was 0.01s, rather than your 0.45s
Looking at the output for your "rogue" run, I observe that the 32 iterations of parallel dual simplex required to solve the presolved LP take 89.99s, but the 6 iterations of serial primal simplex required to solve the original LP from the optimal solution of the presolved LP take "0.00" seconds. This makes me think that there's an issue with multithreading.
Running the dual simplex solver in parallel - particularly for a problem of this size - is of questionable value so try omitting parallel=on option (however you set it).
Of course the "debug" run also uses parallel dual simplex. When you say "debug", I guess you're compiling with the debugging flag (by setting cmake -DCMAKE_BUILD_TUPE=debug) otherwise I don't see the difference in the "debug" run
Unfortunately, if this rogue behaviour is due to calling from C# then we can't help. The HiGHS C# API was written by someone else, and I don't even know how to run HiGHS via C#
@jajhall thank you for your response and investigation. Turning parallel off doesn't help. Reverting to 1.5.3 doesn't help either. I agree that it looks like an environment issue, but I can't figure out what.
By debug I mean running from our webapplication vs running from C# code in the Visual Studio (so it's the same HiGHS DLL, not a different “debug” version). Which shouldn't have much difference either (I compare the .mps we generate etc), but obviously something is different.
Thanks again for trying to help, this is something on my side, and I don't even know how to keep looking. I doubt you'll be able to help further at this point, so you can close the issue.
Thanks. Sorry I can't help, but it does sound like something at your end.
@jajhall hello again, I'm not sure if this qualifies as a new issue, plus I'm very aware that this is likely something with how the library is called and multithreading, but I hope for some advice. I've ran accross another example where HiGHS exe solves a problem in seconds, but calling from C# seems to take forever (I simply stopped waiting).
Yet, disabling parallel works, and I get a quick solution. This is something I'll have to do for now, even though for our larger problems parallel was showing a significant improvement, and was one of the reasons we went with HiGHS.
I'm not uploading .mps yet (should I?), but perhaps you'll be able to recommend what to investigate in this situation. Here's the log for the “endless” case:
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
145 rows, 107305 cols, 745933 nonzeros
145 rows, 107214 cols, 745842 nonzeros
Presolve : Reductions: rows 145(-117); columns 107214(-99); elements 745842(-216)
Solving the presolved LP
Using EKK parallel dual simplex solver - PAMI with concurrency of 8
Iteration Objective Infeasibilities num(sum)
0 0.0000000000e+00 Ph1: 0(0) 0s
321 -9.8922096347e+06 Pr: 0(0); Du: 0(9.7114e-09) 2s
321 -9.8922096347e+06 Pr: 0(0); Du: 0(9.7114e-09) 2s
Using EKK parallel dual simplex solver - PAMI with concurrency of 8
Iteration Objective Infeasibilities num(sum)
321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58177e-11) 2s
WARNING: Increasing Markowitz threshold to 0.5
321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58191e-11) 7s
321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58191e-11) 12s
...
…Then 321 repeats for as long as I waited (over an hour). Does this repeating iteration number mean something? Is the warning significant? And why is the “Using EKK parallel dual simplex solver - PAMI with concurrency of 8” stated twice in the log?
Here's the successful log, with parallel off, quite a contrast (although “Using EKK dual simplex solver - serial” line is still repeated):
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
145 rows, 107305 cols, 745933 nonzeros
145 rows, 107214 cols, 745842 nonzeros
Presolve : Reductions: rows 145(-117); columns 107214(-99); elements 745842(-216)
Solving the presolved LP
Using EKK dual simplex solver - serial
Iteration Objective Infeasibilities num(sum)
0 0.0000000000e+00 Ph1: 0(0) 0s
350 -9.8922096347e+06 Pr: 0(0); Du: 0(7.6456e-09) 2s
350 -9.8922096347e+06 Pr: 0(0); Du: 0(7.6456e-09) 2s
Using EKK dual simplex solver - serial
Iteration Objective Infeasibilities num(sum)
350 -9.8922096347e+06 Pr: 9(0.0021); Du: 0(3.96555e-11) 2s
390 -9.8922096347e+06 Pr: 0(0); Du: 0(3.23401e-05) 2s
Solving the original LP from the solution after postsolve
Using EKK primal simplex solver
Iteration Objective Infeasibilities num(sum)
390 -9.8922096347e+06 Pr: 0(0); Du: 17(0.000294) 2s
392 -9.8922096347e+06 Pr: 0(0); Du: 0(2.94003e-05) 2s
Model status : Optimal
Simplex iterations: 392
Objective value : -9.8922096347e+06
HiGHS run time : 2.47
Again, I understand that this is not a HiGHS issue in itself, but looking at the log differences I wonder if you could make some experienced observations. Thank you!
PS (sorry if this is too much information!) With log_dev_level=3, I see a different log pattern starting around iteration 315… and notice how iteration 321 gets into endless loop to 329 and back:
DuPh2 315 -9.8922544906e+06
Iteration Objective Infeasibilities num(sum)
DuPh2 315 -9.8922544942e+06 Pr: 0(0); Du: 0(9.59158e-06) Synthetic clock
dual-phase-2-optimal
dual-cleanup-shift
Iteration Objective Infeasibilities num(sum)
DuPh2 315 -9.8922094863e+06 Pr: 0(0); Du: 11(0.00526849) Perturbation cleanup
HEkkDual:: Using primal simplex to try to clean up num / max / sum = 11 / 0.00150528 / 0.00526849 dual infeasibilities
primal-phase2-start
Iteration Objective Infeasibilities num(sum)
PrPh2 315 -9.8922094863e+06 Pr: 0(0); Du: 11(0.00526849) No reason
Iteration Objective
PrPh2 316 -9.8922094863e+06
PrPh2 317 -9.8922095206e+06
PrPh2 318 -9.8922095231e+06
PrPh2 319 -9.8922095921e+06
PrPh2 320 -9.8922095921e+06
PrPh2 321 -9.8922096347e+06
Iteration Objective Infeasibilities num(sum)
PrPh2 321 -9.8922096347e+06 Pr: 0(0); Du: 0(9.7114e-09) Possibly optimal
primal-phase-2-optimal
problem-optimal
Simplex iterations: DuPh2 315; PrPh2 6; Total 321
EKK dual simplex solver returns 0 primal and 0 dual infeasibilities: Status Optimal
Have num/max/sum primal (11/0.000985222/0.00263214) and dual (0/6.62359e-12/4.58177e-11) unscaled infeasibilities
Using EKK parallel dual simplex solver - PAMI with concurrency of 8
Dual feasible with unperturbed costs and num / max / sum primal infeasibilities of 11 / 0.000985222 / 0.00263214, so near-optimal
Near-optimal, so don't use cost perturbation
dual-phase-2-start
Iteration Objective Infeasibilities num(sum)
DuPh2 321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58177e-11) No reason
Iteration Objective
DuPh2 322 -9.8922096347e+06
DuPh2 323 -9.8922096347e+06
DuPh2 324 -9.8922096347e+06
DuPh2 325 -9.8922096347e+06
DuPh2 326 -9.8922096347e+06
DuPh2 327 -9.8922096347e+06
DuPh2 328 -9.8922096347e+06
Iteration Objective Infeasibilities num(sum)
DuPh2 321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58191e-11) Possibly singular basis
DuPh2 322 -9.8922096347e+06
DuPh2 323 -9.8922096347e+06
DuPh2 324 -9.8922096347e+06
DuPh2 325 -9.8922096347e+06
DuPh2 326 -9.8922096347e+06
DuPh2 327 -9.8922096347e+06
DuPh2 328 -9.8922096347e+06
DuPh2 329 -9.8922096347e+06
WARNING: Increasing Markowitz threshold to 0.5
Iteration Objective Infeasibilities num(sum)
DuPh2 321 -9.8922096347e+06 Pr: 11(0.00263214); Du: 0(4.58191e-11) Possibly singular basis
DuPh2 322 -9.8922096347e+06
...
(and so on and on and on)
Would you mind sharing the MPS file, particularly if you observe this issue when you debug in VSCode? I am wondering if I can replicate the hanging behaviour if I try to solve the MPS file from our C++ generated executable directly. The occasional hanging may be related to the threads bug on Windows, which is now observed both from Julia and Python and now C#. It must be something in the C++ code underneath the wrappers and it would be very helpful if I can replicate this
@galabovaa thank you for looking into this! Here's the mps file that caused this issue back in January: LP-problem.zip
Please note that since then we've changed our logic to calling highs.exe instead of calling the DLL from C#, as the DLL proved to be too unreliable for production usage. The .exe has its issues too, as you noted in the other ticket.