
Parsing of Gurobi's primal bounds history incomplete/inaccurate

Open rschwarz opened this issue 1 year ago • 8 comments

I'm using ipet to extract the primal bound history from a Gurobi logfile, but only the first solution is stored, not the last (improving) solution found.

Here's an extract from the logfile:

Optimize a model with 1776 rows, 1464 columns and 5324 nonzeros
[...]
Variable types: 713 continuous, 141 integer (141 binary)
Found heuristic solution: objective 88297.179396
Found heuristic solution: objective 83210.981562

Root relaxation: objective 0.000000e+00, 858 iterations, 0.01 seconds (0.02 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

*    0     0               0       0.0000000    0.00000  0.00%     -    0s

Explored 1 nodes (858 simplex iterations) in 0.03 seconds (0.05 work units)
Thread count was 1 (of 16 available processors)

Solution count 3: 0 83211 88297.2 

Optimal solution found (tolerance 1.00e-04)
Best objective 0.000000000000e+00, best bound 0.000000000000e+00, gap 0.0000%

As far as I can see, two solutions were found by heuristics after presolve but before the root relaxation, with values 88297.179396 and 83210.981562.

But then, when the root node is solved, the relaxation finds the optimal solution of value 0 immediately (time = 0s).

So, I would have expected a primal bound of either just [(0.0, 0.0)] (taking into account only the table lines), or something like [(0.0, 88297.179396), (0.0, 83210.981562), (0.0, 0.0)].

Instead, the returned list is [(0.0, 83210.981562)], which is missing the solution in the root node.

rschwarz avatar Dec 16 '24 09:12 rschwarz

Looking at the method that does the parsing, I feel that self.gurobiextralist = [] happens too rarely.

The first lines that report the pre-root solutions add values to this "extra list", but carry no time information, so no datum is added to the primal bound history.

These values then linger as further lines are parsed, with the effect that the early values are added later anyway, overwriting the better solution found at the root node.

https://github.com/GregorCH/ipet/blob/dbdb0f56ee0d6ac1df4fa203bd6da5fd70945d08/ipet/parsing/Solver.py#L656-L678
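
The effect can be illustrated with a minimal, self-contained sketch; this is a toy model of the parser's control flow, not the actual ipet code, and all names in it are hypothetical:

```python
def parse_history(lines, flush_first=True):
    """Toy model of the parser's case order (hypothetical, not real ipet code).

    flush_first=True mimics the original order (the "flush the extra list"
    case is checked before the table case); False mimics the swapped order.
    """
    extralist, history = [], []

    def flush(line):
        # a timed line flushes the stale pre-root value at that timestamp
        history.append((float(line.split()[-1].rstrip("s")), extralist[-1]))
        extralist.clear()

    def table(line):
        # incumbent table line: objective in column -5, time in the last column
        items = line.split()
        history.append((float(items[-1].rstrip("s")), float(items[-5])))

    for line in lines:
        if line.startswith("Found heuristic solution:"):
            # pre-root solution: value is known, but no timestamp yet
            extralist.append(float(line.split()[-1]))
        elif flush_first and line.endswith("s") and extralist:
            flush(line)
        elif line.startswith(("H", "*")):
            table(line)
        elif line.endswith("s") and extralist:
            flush(line)
    return history

log = [
    "Found heuristic solution: objective 88297.179396",
    "Found heuristic solution: objective 83210.981562",
    "*    0     0               0       0.0000000    0.00000  0.00%     -    0s",
]
print(parse_history(log, flush_first=True))   # stale value, root solution lost
print(parse_history(log, flush_first=False))  # root solution recorded
```

With the original order the stale heuristic value 83210.981562 is emitted at the table line's timestamp and the root incumbent 0.0 is never recorded, reproducing the reported [(0.0, 83210.981562)]. With the swapped order the root solution is recorded (the leftover extra-list values would then be flushed by a later timed line, as discussed below).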

rschwarz avatar Dec 16 '24 09:12 rschwarz

What solved the issue for me (in this example) is swapping the order of the two relevant cases, i.e.,

        elif self.inTable and (line.startswith("H") or line.startswith("*")):
            self.readBoundAndTime(line, -5, -1, timestripchars = "s")
        elif self.inTable and line.endswith("s\n") and self.gurobiextralist != []:
            pointInTime = line.split()[-1].strip("s")
            self.addHistoryData(Key.PrimalBoundHistory, pointInTime, self.gurobiextralist[-1])
            self.gurobiextralist = []

This only keeps the relaxation solution, but it's the best at this point in time, so I think it should be OK.

rschwarz avatar Dec 16 '24 10:12 rschwarz

A further problem is caused by the addition of work units to the log output of Gurobi.

In particular, consider the following line from a log:

Explored 0 nodes (0 simplex iterations) in 0.02 seconds (0.03 work units)

For solutions from pre-root heuristics, the parsing method wants to extract the 0.02 as a time measurement, but the current code just looks at the second-to-last token (which is now the work units value). https://github.com/GregorCH/ipet/blob/dbdb0f56ee0d6ac1df4fa203bd6da5fd70945d08/ipet/parsing/Solver.py#L672-L675

The following snippet worked for me:

        elif self.gurobiextralist != [] and "Explored " in line:
            items = line.split()
            seconds_idx = items.index("seconds")
            pointInTime = items[seconds_idx - 1]  # the token before "seconds"
            self.addHistoryData(
                Key.PrimalBoundHistory, pointInTime, self.gurobiextralist[-1]
            )
            self.gurobiextralist = []

rschwarz avatar Dec 16 '24 11:12 rschwarz

And yet another case that is not properly handled: new solutions found during "clean up", when the time limit is almost reached. See, for example, this snippet from a log:

   400   274  postponed  101       318.92068    1.00000   100%  4822  578s
   401   274  postponed  101       318.92068    1.00000   100%  4974  591s

Cutting planes:
  Learned: 1
Warning: uncrushing solution improved objective value from 318.9206756 to 318.8422373

Cleanup yields a better solution

Explored 402 nodes (2016138 simplex iterations) in 600.00 seconds (785.79 work units)
Thread count was 1 (of 16 available processors)

Solution count 3: 119.89 318.921 470.921 

Time limit reached
Best objective 1.198900905095e+02, best bound 1.000000000000e+00, gap 99.1659%

The last reported primal bound is the 318.9... from the table lines, even though a better solution with value 119.89 is apparently found during cleanup.

I guess this will be difficult to parse, though...

EDIT: The following snippet worked for me:

        # Account for improvements found during cleanup (at the end);
        # self.cleanup = False and self.cleanup_time = None need to be
        # initialized alongside the other parser state.
        if "Cleanup yields a better solution" in line:
            self.cleanup = True
        elif self.cleanup:
            if "Explored " in line:
                items = line.split()
                seconds_idx = items.index("seconds")
                pointInTime = items[seconds_idx - 1]  # the token before "seconds"
                self.cleanup_time = pointInTime
            elif "Best objective" in line:
                value = line.split()[2].rstrip(",")
                assert self.cleanup_time is not None
                self.addHistoryData(Key.PrimalBoundHistory, self.cleanup_time, value)
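
As an alternative, the two closing lines of the log can also be matched with regular expressions, which sidesteps the token-position issues entirely. A self-contained sketch; the helper name is hypothetical and not part of ipet:

```python
import re

def parse_final_summary(lines):
    """Pull (time_in_seconds, best_objective) from the closing lines of a
    Gurobi log, tolerating the "(... work units)" suffix of newer versions.

    Hypothetical helper for illustration, not part of ipet.
    """
    time, objective = None, None
    for line in lines:
        if line.startswith("Explored"):
            # e.g. "Explored 402 nodes (...) in 600.00 seconds (785.79 work units)"
            m = re.search(r"in ([0-9.]+) seconds", line)
            if m:
                time = float(m.group(1))
        elif line.startswith("Best objective"):
            # e.g. "Best objective 1.198900905095e+02, best bound ..., gap ..."
            m = re.match(r"Best objective ([-+0-9.eE]+),", line)
            if m:
                objective = float(m.group(1))
    return time, objective

lines = [
    "Explored 402 nodes (2016138 simplex iterations) in 600.00 seconds (785.79 work units)",
    "Thread count was 1 (of 16 available processors)",
    "Best objective 1.198900905095e+02, best bound 1.000000000000e+00, gap 99.1659%",
]
print(parse_final_summary(lines))
```

On the log excerpt above this yields the time 600.0 and the cleanup-improved objective 119.89..., which is the datum missing from the history.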

rschwarz avatar Dec 16 '24 11:12 rschwarz

Thank you for raising these issues and providing suggestions already. In order to resolve the errors permanently, I would like to add new Gurobi logs to the unit tests. As you can imagine, I currently don't have direct access to recent Gurobi logs from the newest version, after they decided to no longer appear in the public benchmarks. I will see what I can do.

GregorCH avatar Dec 19 '24 09:12 GregorCH

Yeah, no problem. I can share some of our logfiles, if that helps.

I would have created a PR as well, but I had trouble installing the overall package with its dependencies, so I just copied files out of the parsing part.

UPDATE: I've shared two logfiles that demonstrate the issues above in a Gist. I've also included my updated version of GurobiSolver from parsing.py.

Thanks for providing this tool, it was of great help for my analysis of early primal bounds :heart:

rschwarz avatar Dec 19 '24 09:12 rschwarz

There's also an official log file parser from Gurobi here: https://github.com/Gurobi/gurobi-logtools/

But I couldn't quickly tell whether it supports the (primal) bound history.

rschwarz avatar Dec 19 '24 09:12 rschwarz

Thanks for posting this here, @rschwarz. gurobi-logtools supports parsing the primal and dual bounds from the logs. The data can be found in the "Incumbent" and "BestBd" columns of the nodelog progress table. The output is a pandas DataFrame, and you can filter and plot the data as shown below:

import gurobi_logtools as glt
results = glt.parse(["defaults1200/*log"])
nodelog_progress = results.progress("nodelog")
glt.plot(nodelog_progress, y="Incumbent", color="Model", type="line")

lennsea avatar Jan 07 '25 13:01 lennsea