
BUG: Gracefully fail for OOM (Out of Memory) in Presolve

HaoZeke opened this issue 2 years ago • 3 comments

As noted in the title. Essentially this is a fix for https://github.com/scipy/scipy/issues/15888.

Although this does introduce HighsExceptions, it requires no changes to the C API or to any other part of the code: the exception is only used to propagate the error internally, and (within the C++ code itself) it maps to existing error codes, raising Infeasible.

Some more context. For an unreasonably large problem (which nonetheless has a trivial solution that can be found without Presolve):

from scipy.optimize import linprog
import numpy as np
np.random.seed(0)
n_ctr = 500000
n_var = 500
c_ = np.ones(n_var)
A_ub = np.random.random((n_ctr, n_var))
b_ub = np.zeros((n_ctr, ))

res = linprog(c_, A_ub=A_ub, b_ub=b_ub, method='highs-ds', bounds=(0, None), options={'disp': True})

print('End')

This previously triggered the out-of-memory killer and crashed. Now it will instead report:

Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Presolving model
ERROR:   Problem too large for Presolve, try without
Specific error raised:
Failed to resize the vector. Requested new size: -2142950640. Size to add is 24. Current size: 2152016632.
Problem status detected on presolve: Infeasible
Model   status      : Infeasible
Objective value     :  0.0000000000e+00
HiGHS run time      :         29.61

This will also prevent similar crashes across the rest of HiGHS; I used the Python example because that was where the problem was first reported.

FWIW (maybe something for the HiGHS team), this problem has a trivial solution:

from scipy.optimize import linprog
import numpy as np
np.random.seed(0)
n_ctr = 500000
n_var = 500
c_ = np.ones(n_var)
A_ub = np.random.random((n_ctr, n_var))
b_ub = np.zeros((n_ctr, ))

res = linprog(c_, A_ub=A_ub, b_ub=b_ub, method='highs-ds', bounds=(0, None), options={'disp': True, 'presolve': False})
Solving LP without presolve or with basis
Model   status      : Optimal
Objective value     :  0.0000000000e+00
HiGHS run time      :         25.84

So my understanding is that this should somehow trigger HighsPresolveStatus::kReducedToEmpty and then bail out (special-cased) instead of entering the postsolve phase (the length error comes from the PostSolveStack). Currently the code treats kReducedToEmpty and kReduced the same way:

// Highs.cpp ~ line 1398
      if (model_presolve_status_ == HighsPresolveStatus::kReduced ||
          model_presolve_status_ == HighsPresolveStatus::kReducedToEmpty) {

However, I think that can be dealt with later (if at all), since it would be an improvement (faster runtime than the current approach), while this PR fixes a bug (HiGHS crashing).

HaoZeke avatar Oct 21 '23 22:10 HaoZeke

Thanks, but there are a couple of issues here.

Firstly, return Result::kPrimalInfeasible; cannot be returned, as infeasibility hasn't been determined. The postsolve_stack data should be cleared, and presolve should return as if no reductions can be performed.

Is there an existing Result:: which should be raised instead? With kStopped it will seg-fault afterwards:

Presolving model
ERROR:   Problem too large for Presolve, try without
Specific error raised:
Failed to resize the vector. Requested new size: -2142950640. Size to add is 24. Current size: 2152016632.
presolve_.run() returns status: Reduced
Presolve : Reductions: rows 499999(-1); columns 231(-269); elements 115500000(-134500000)
Solving the presolved LP

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
test_oomkill is a Catch v2.13.8 host application.
Run with -? for options

-------------------------------------------------------------------------------
linprog_oom
-------------------------------------------------------------------------------
../check/TestOOMKill.cpp:11
...............................................................................

../check/TestOOMKill.cpp:11: FAILED:
  {Unknown expression after the reported line}
due to a fatal error condition:
  SIGSEGV - Segmentation violation signal

Secondly, I'm unclear on why you think it should somehow trigger HighsPresolveStatus::kReducedToEmpty: the problem is not reduced to empty if presolve fails. For me, it should trigger HighsPresolveStatus::kNotReduced, in which case there's no call to postsolve, even if the postsolve stack isn't cleared.

It's correct to treat kReducedToEmpty and kReduced the same way. If the model is reduced to empty, then solution_ and basis_ are trivially empty, so postsolve builds on this. If presolve yields kReduced (i.e. reductions are identified, but the LP is non-empty), then the LP solver applied to the reduced LP yields a non-empty solution_ and basis_.

Makes sense. This was a misunderstanding on my part: on a machine with more memory and with 64-bit integers, the presolve phase reduced to empty, so I thought there might be an optimized path for it, but that would be incorrect here.

I'm happy to merge this into a development branch for me to fix the issues above, before merging to latest.

That'd be great. I've pushed a test in the C++ codebase as well. To run it:

meson setup bbdir --buildtype="debug" -Dwith_tests=True -Dhighsint64=False
meson compile -C bbdir
./bbdir/check/test_oomkill
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Running with 6 thread(s)
Presolving model
ERROR:   Problem too large for Presolve, try without
Specific error raised:
Failed to resize the vector. Requested new size: -2142950640. Size to add is 24. Current size: 2152016632.
presolve_.run() returns status: Infeasible
Problem status detected on presolve: Infeasible
Model   status      : Infeasible
Objective value     :  0.0000000000e+00
HiGHS run time      :         24.44
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Running with 6 thread(s)
Solving LP without presolve or with basis
Postsolve  : 
Time       :    19.49
Time Pre   :    -1.00
Time PreLP :    -1.00
Time PostLP:    19.49
For LP                 : Solve original LP    19.49 ( 99%)
Model   status      : Optimal
Objective value     :  0.0000000000e+00
HiGHS run time      :         20.06
===============================================================================
All tests passed (6 assertions in 2 test cases)

It seems a new status flag might be needed, perhaps something like Result::kPresolveFail -> HighsPresolveStatus::kNotPresolved -> HighsModelStatus::kNotSet.

HaoZeke avatar Oct 22 '23 14:10 HaoZeke

Thanks. Overall, this is another very useful contribution to HiGHS from you. I will look at this in a week's time; I'm hard-pressed with work this week.

jajhall avatar Oct 22 '23 14:10 jajhall

Awesome, thanks so much. For now, I've added a new error code so it goes from Result::kError->HighsPresolveStatus::kError->HighsModelStatus::kPresolveError and added tests for this:

./bbdir/check/test_oomkill
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Running with 6 thread(s)
Presolving model
ERROR:   Problem too large for Presolve, try without
Specific error raised:
Failed to resize the vector. Requested new size: -2142950640. Size to add is 24. Current size: 2152016632.
presolve_.run() returns status: Presolve error
Presolve returned status 9
Model   status      : Presolve error
HiGHS run time      :         24.17
Running HiGHS 1.6.0: Copyright (c) 2023 HiGHS under MIT licence terms
Running with 6 thread(s)
Solving LP without presolve or with basis
Postsolve  : 
Time       :    16.43
Time Pre   :    -1.00
Time PreLP :    -1.00
Time PostLP:    16.43
For LP                 : Solve original LP    16.43 ( 99%)
Model   status      : Optimal
Objective value     :  0.0000000000e+00
HiGHS run time      :         17.02
===============================================================================
All tests passed (6 assertions in 2 test cases)

Feel free to make changes as needed next week :)

HaoZeke avatar Oct 22 '23 15:10 HaoZeke