HiGHS icon indicating copy to clipboard operation
HiGHS copied to clipboard

Enhance getSolution(): convert lists to NumPy arrays

Open ParthPore10 opened this issue 2 months ago • 8 comments

This PR updates the Python interface of getSolution() to return NumPy arrays instead of native Python lists for the following attributes:

  1. col_value
  2. col_dual
  3. row_value
  4. row_dual

Returning NumPy arrays improves numerical performance and consistency, since most users already work with NumPy-based data when constructing models or performing post-processing. This change avoids redundant np.array() conversions in downstream applications and aligns input and output data types.

Implementation Details

Overrides getSolution() to convert solution lists into NumPy float arrays using _as_float_array(). Ensures compatibility by preserving the original structure of the solution object.

References Fixes #1590

ParthPore10 avatar Oct 25 '25 03:10 ParthPore10

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 81.06%. Comparing base (3dab010) to head (d87bb3f). :warning: Report is 67 commits behind head on latest.

Additional details and impacted files
@@            Coverage Diff             @@
##           latest    #2603      +/-   ##
==========================================
- Coverage   81.06%   81.06%   -0.01%     
==========================================
  Files         347      347              
  Lines       85239    85219      -20     
==========================================
- Hits        69100    69082      -18     
+ Misses      16139    16137       -2     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar Oct 25 '25 07:10 codecov[bot]

What do you reckon @mathgeekcoder ?

jajhall avatar Oct 25 '25 08:10 jajhall

@jajhall: I wouldn't accept this change. I have thought about returning these as numpy arrays (instead of lists), but this is not the best way to do this - it will cause an unnecessary copy and breaks the typing information.

@ParthPore10: The code isn't bad, but could be better. Per the documentation, np.asarray(a) will copy the memory (unless a is already a numpy array of same type). Your check inside float_array is basically redundant, and since we have a list, it will perform a copy.

That said, if we modify highs_bindings.cpp similar to what I did for HighsLp::col_cost_, e.g.,:

  py::class_<HighsLp>(m, "HighsLp", py::module_local())
      .def(py::init<>())
      .def_property("col_cost_", make_readonly_ptr(&HighsLp::col_cost_),
                    make_setter_ptr(&HighsLp::col_cost_))

This should avoid unnecessary copies and return a numpy array on the python side. If this is changed, the corresponding types in _core/__init__.pyi needs to be updated too, e.g.,:

class HighsSolution:
    col_dual: list[float]  # list should be changed to numpy.ndarray[typing.Any, numpy.dtype[numpy.float64]]
    col_value: list[float]
    dual_valid: bool
    row_dual: list[float]
    row_value: list[float]
    value_valid: bool

    def __init__(self) -> None: ...

I'm hesitant to use numpy arrays everywhere (in some cases lists are actually faster), but for this it should be fine.

BTW: if you use the highs.vals function, it will already return a numpy array (or dictionary if appropriate). Although it requires a copy by design.

mathgeekcoder avatar Oct 27 '25 17:10 mathgeekcoder

@jajhall: I wouldn't accept this change. I have thought about returning these as numpy arrays (instead of lists), but this is not the best way to do this - it will cause an unnecessary copy and breaks the typing information.

Thanks, I guess that returning these as numpy arrays (instead of lists) means that we avoid users having to avoid this issue?

jajhall avatar Oct 27 '25 18:10 jajhall

Thanks, I guess that returning these as numpy arrays (instead of lists) means that we avoid users having to avoid this issue?

Great question! I didn't think it would help - but I gave it a quick test, and you are correct - that issue doesn't seem to occur with a numpy array from pybind (using my suggested change above). e.g.,:

lp = h.getLp()

# fast (uses numpy)
value = [lp.col_cost_[icol] for icol in range(num_vars)]

# slow (uses list)
value = [lp.col_lower_[icol] for icol in range(num_vars)]

mathgeekcoder avatar Oct 27 '25 20:10 mathgeekcoder

@ParthPore10 Please edit the PR as suggested by @mathgeekcoder

@ParthPore10: Since this is a larger change, and modifying the C++ code - please let me know if you'd prefer me to make these changes, or if you'd like to make an attempt.

mathgeekcoder avatar Oct 27 '25 20:10 mathgeekcoder

Hii @mathgeekcoder , @jajhall tthanks for pointing that out. I’ll review the problem and take another crack at it.

ParthPore10 avatar Oct 27 '25 20:10 ParthPore10

@ParthPore10: there's something odd about your PR. The code-diff looks like it's comparing to an older branch, e.g., it doesn't recognize the changes for col_cost_ have already been made.

Also, BTW: I wouldn't change the bazel build; there's already a setter function (so you should use that instead of writing your own); and you didn't actually change the getter/setter for HighsSolution in highs_bindings.cpp.

mathgeekcoder avatar Oct 29 '25 18:10 mathgeekcoder