HiGHS crashes with assertion failure

Open zxt5 opened this issue 9 months ago • 11 comments

HiGHS (built from the latest branch) crashes when solving seed.mps.txt with option file highs_options.txt

root@b9f8d0525121:/tmp# cat highs_options.txt
mip_rel_gap=0
mip_abs_gap=0
presolve=off
mip_lp_age_limit=19
mip_pool_age_limit=2
mip_heuristic_effort=0.321638279322984
root@b9f8d0525121:/tmp# highs --model_file seed.mps --options_file highs_options.txt
Running HiGHS 1.9.0 (git hash: acf51510d): Copyright (c) 2025 HiGHS under MIT licence terms
Set option presolve to "off"
Set option log_file to "HiGHS.log"
Set option mip_lp_age_limit to 19
Set option mip_pool_age_limit to 2
Set option mip_heuristic_effort to 0.321638
Set option mip_rel_gap to 0
Set option mip_abs_gap to 0
MIP  seed has 370 rows; 19 cols; 7029 nonzeros; 11 integer variables
Coefficient ranges:
  Matrix [1e-02, 2e+01]
  Cost   [5e-01, 1e+02]
  Bound  [2e+02, 2e+02]
  RHS    [5e-02, 1e+02]

Presolve is switched off

Solving MIP model with:
   370 rows
   19 cols (0 binary, 11 integer, 0 implied int., 8 continuous)
   7029 nonzeros

Src: B => Branching; C => Central rounding; F => Feasibility pump; H => Heuristic; L => Sub-MIP;
     P => Empty MIP; R => Randomized rounding; S => Solve LP; T => Evaluate node; U => Unbounded;
     z => Trivial zero; l => Trivial lower; u => Trivial upper; p => Trivial point; X => User solution

        Nodes      |    B&B Tree     |            Objective Bounds              |  Dynamic Constraints |       Work
Src  Proc. InQueue |  Leaves   Expl. | BestBound       BestSol              Gap |   Cuts   InLp Confl. | LpIters     Time

         0       0         0   0.00%   -187950         inf                  inf        0      0      0         0     0.0s
         0       0         0   0.00%   2478.046103     inf                  inf        0      0      0        62     0.0s
 C       0       0         0   0.00%   2508.606812     3024.888376       17.07%       29      5      0        75     0.3s
 T      16       0         5   3.71%   2513.575625     3010.091182       16.50%       27      4     27      1920     4.4s
 T      57       2        24   8.01%   2513.575625     2963.942415       15.19%       59      4     60      2352     5.2s
 T     151      12        64  10.96%   2513.711157     2961.310011       15.11%      100      4    112      2746     6.3s
 T     286      29       123  60.80%   2513.739403     2949.737102       14.78%      126      4    201      3312     8.1s
 T     351      19       155  61.24%   2513.739403     2891.23908        13.06%      156      4    244      3552     8.4s
 T     449      24       201  61.62%   2513.86308      2861.092199       12.14%      155      4    308      3914     9.3s
 T     474      25       207  61.65%   2513.86308      2809.125929       10.51%      139      4    319      4016     9.6s
 T     521      23       232  62.26%   2513.86308      2795.344773       10.07%      164      4    359      4216     9.9s
 T     654      25       296  69.89%   2513.86308      2776.400062        9.46%      144      4    500      4972    11.9s
highs: /tmp/HiGHS/src/mip/HighsSearch.cpp:301: HighsInt HighsSearch::selectBranchingCandidate(int64_t, double&, double&): Assertion `lower_residual > -1e-12 && upper_residual > -1e-12' failed.
Aborted (core dumped)

zxt5 avatar Mar 05 '25 20:03 zxt5

Fixed by #2203

jajhall avatar Mar 05 '25 21:03 jajhall

I am not sure, since the crash still happens on the latest branch

zxt5 avatar Mar 05 '25 21:03 zxt5

Not for me: have you pulled and rebuilt?

jajhall avatar Mar 05 '25 21:03 jajhall

Yes, I pulled and rebuilt. Let me double-check.

zxt5 avatar Mar 05 '25 21:03 zxt5

Ah, you used presolve off

jajhall avatar Mar 05 '25 21:03 jajhall

Yes, presolve=off is in the option file. I just double-checked, and I can still reproduce the bug on version acf51510d.

zxt5 avatar Mar 05 '25 21:03 zxt5

It needs more than presolve=off! I see all your option settings now, and reproduce the failure.

jajhall avatar Mar 05 '25 21:03 jajhall

I assume that you're tuning HiGHS, and this is why you're creating so many instances!

jajhall avatar Mar 05 '25 21:03 jajhall

Ah, I am actually fuzzing MIP solvers: not only HiGHS, but also SCIP, CBC, etc.

zxt5 avatar Mar 05 '25 21:03 zxt5

Thanks for examining these issues!

zxt5 avatar Mar 05 '25 21:03 zxt5

I looked into this a bit today because I've been getting familiar with HighsSearch, and the bug is still appearing on latest. Leaving this here so that I can come back to it later, or in case someone else feels like picking it up.

Summary of bug: At some LP (deep inside the RINS heuristic), the lower bound of column 4 is moved from 0 to 1 (its upper bound is 2). Before and after solving the LP, there exists a substitution from a clique that says col4 = 1 - col6. col6 has some fractional value, 0.05 or so, and HiGHS overwrites the returned value of 1 for col4 with 1 - 0.05 = 0.95. The overwrite is strictly for the purpose of storing fractional integers for branching and for attempting trivial heuristics; it does NOT overwrite the actual value in the LP. This later trips the assertion in the branching score calculation, because 0.95 is outside col4's domain of [1, 2].
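To make the numbers concrete, here is a minimal sketch of the check that fails. The `Bounds` struct and the `insideDomain` helper are hypothetical stand-ins written for this comment, not the actual HiGHS code; only the `-1e-12` tolerance and the residual names are taken from the assertion message.

```cpp
// Hypothetical stand-in for a column's local domain; not a HiGHS type.
struct Bounds {
  double lower;
  double upper;
};

// Mirrors the shape of the failing assertion: both residuals must be
// (almost) non-negative for the value to count as inside the domain.
bool insideDomain(double value, const Bounds& b, double tol = 1e-12) {
  const double lower_residual = value - b.lower;  // distance above the lower bound
  const double upper_residual = b.upper - value;  // distance below the upper bound
  return lower_residual > -tol && upper_residual > -tol;
}

// Value stored for branching after applying the substitution col4 = 1 - col6.
double substitutedCol4(double col6_value) { return 1.0 - col6_value; }
```

With the local domain [1, 2], `insideDomain(1.0, {1.0, 2.0})` holds for the LP value, while `insideDomain(substitutedCol4(0.05), {1.0, 2.0})` does not, since the lower residual is about -0.05.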

Major confusion: Is it actually safe to just change these values based on the substitutions? Minor confusion: I'm not yet familiar with the clique table code, or with the full extent to which fractionalints is used outside of branching. Either a basic if-statement check for the substitution needs to be added, or the substitution is not being cleaned up correctly following the domain change.

Opt-Mucca avatar May 08 '25 13:05 Opt-Mucca

I looked into this! Think I've actually located the issue too!

In a sub-MIP, HiGHS manages to tighten the global bounds of x4 from a dual proof. That is, given x4 + x6 <= 1, where 0 <= x6 <= 1 and 0 <= x4 <= 2, it can conclude that x4 <= 1. The problem is that this bound update is only given to the globaldomain and NOT to the domain attached to the HighsSearch object. This leads to a problem later, when HiGHS performs a substitution that it believes is globally valid but that does not actually respect the local domain of the search. The two possible solutions:

  • Add a check for a valid domain when the substitution is happening. Skip the substitution if the domain isn't satisfied. This should have essentially no overhead and will be safe.
  • Add a communication step to push any global bound changes to the search domain after the heuristics are called in the search. This would fix the issue here and maybe even improve performance slightly for the search, but doesn't fix the fundamental problem of two potentially different domains communicating at different points during the solve process.
  • Both could be added?
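A rough sketch of what the two options could look like, under the assumption that a column's domain is just a pair of bounds. `Bounds`, `substitutedOrLpValue` and `pullGlobalBounds` are hypothetical names for illustration; none of this is the HiGHS implementation or the change that was eventually merged.

```cpp
#include <algorithm>

// Hypothetical stand-in for a column's domain; not a HiGHS type.
struct Bounds {
  double lower;
  double upper;
};

// Option 1: apply the substitution col = offset + scale * other only when the
// result lies inside the column's local domain; otherwise keep the LP value.
double substitutedOrLpValue(double lp_value, double other_value, double offset,
                            double scale, const Bounds& local,
                            double tol = 1e-12) {
  const double candidate = offset + scale * other_value;
  const bool in_domain =
      candidate - local.lower > -tol && local.upper - candidate > -tol;
  return in_domain ? candidate : lp_value;
}

// Option 2: intersect the local search domain with the global domain.
// Returns false when the intersection is empty (the local node is infeasible).
bool pullGlobalBounds(const Bounds& global, Bounds& local) {
  local.lower = std::max(local.lower, global.lower);
  local.upper = std::min(local.upper, global.upper);
  return local.lower <= local.upper;
}
```

With the numbers from this issue (local domain [1, 2], global domain [0, 1], substitution x4 = 1 - x6 with x6 = 0.05), option 1 keeps the LP value 1 because 0.95 is outside the local domain, and option 2 shrinks the local domain to [1, 1].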

Need to think on this a bit.

Opt-Mucca avatar Jul 09 '25 16:07 Opt-Mucca

Closed by #2462

jajhall avatar Jul 10 '25 12:07 jajhall