HiGHS
HiGHS copied to clipboard
Overhead using highspy
I'm just calling your attention to a potential pitfall for users of highspy, which is that accessing solution.col_value[icol]
is a quite slow operation, I guess because of some overhead in pybind11. Doing over 100k independent accesses to retrieve a solution takes a few minutes, but can be reduced to milliseconds by first converting to a Python list and accessing that instead.
I made the relevant change in PuLP recently since whoever wrote the interface had made that mistake which was obviously a massive and avoidable overhead. https://github.com/coin-or/pulp/pull/641
Perhaps it should remain the responsibility of consumers of highspy to be aware of and manage, but you could consider implementing something like a solution.col_values()
function that returns a python list of values. Wouldn't even need to be cached as that introduces a risk of invalidation. I'd be happy to implement it and open a PR, but I wasn't sure what your policy would be on unsolicited feature PRs(?). In general I hope to be a contributor to HiGHS in the near future 🙂.
I would love to use highspy in favor of the open source solver we use now, but solution.col_value[icol]
is way too slow for me. I found that the access time of solution.col_value[icol]
scales with the number of variables in the program. I have in the order of 1^10
variables in my program which I can build and solve reasonably fast in highspy, but accessing solution.col_value[icol]
takes around 0.16s (per variable!). It would take many days for me to retrieve all variable values. I hope this issue will get some attention!
@jajhall Thanks for the tip regarding the Python list. I will try that out if it can temporarily solve my performance issues.
This is very odd. Addressing this is on our ToDo list, and we've mentioned it in discussions, but that's as far as it's got. I Your comment raises the priority!
So, I've reproduced the behaviour for highspy_speed_test.py and now see what "converting to a Python list" means.
Maybe this is a known pybind11
issue, for which there is a solution - perhaps to be implemented where py::class_<HighsSolution>(m, "HighsSolution")
is defined in highspy/highs_bindings.cpp
, but I'm loathed to invest time in it myself since there is a solution that can be implemented in the Python call, that can be flagged up via a warning in our documentation.
The original comment in this issues is one that I pasted from an email that I no longer have. Could it have come from @aphi?
Yes that comment was from an email I sent and referring to the change we made in PuLP to avoid this overhead. Hope the solution works for you @matt035343 .
@aphi It did work, thanks. I now use it as a temporary workaround. For anyone experiencing this issue, you can use following snippet:
values = list(solution.col_value)
values[icol]
However, I do still believe it should be fixed permanently somewhere upstream
Anyone have an idea how to fix it upstream?
I'd have thought it was a known pybind11 issue
I've submitted a query to pybind11
Only solution at the moment is a warning in the documentation