MathOptInterface.jl
Improve performance of String names
A common source of performance problems in JuMP and MathOptInterface is dealing with the String names of variables and constraints.
As a recent example, see https://github.com/jump-dev/MathOptInterface.jl/pull/2426 and related issues/PRs.
The main place that Strings are stored is:
https://github.com/jump-dev/MathOptInterface.jl/blob/dcfb03311bb6afc10a36e7967cdd9aecb706241c/src/Utilities/model.jl#L27-L31
When building models with large (>10^5) numbers of variables and constraints, these dictionaries contain a large number of strings. Each String is a GC-tracked object, and it is long-lived, because we maintain these dictionaries throughout the lifetime of a Model. If CachingOptimizer is involved, there can also be multiple copies of the strings, for example, if we have a setup like Cache(Model, Bridge(Cache(Model, Solver))).
Lots of long-lived GC-tracked objects are bad for Julia's garbage collector.
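To get a rough feel for the effect, here is a minimal sketch that times a full collection before and after creating a long-lived dictionary of names. It uses only Base Julia and does not touch MOI; the Dict{Int,String} is an assumed stand-in for the name dictionaries, and full_gc_seconds and name_map are hypothetical names. Timings will vary by machine and Julia version.

```julia
# Time one full garbage collection, after a first pass to settle the heap.
function full_gc_seconds()
    GC.gc()
    return @elapsed GC.gc()
end

n = 10^6

# Baseline: no large collection of strings is alive yet.
baseline = full_gc_seconds()

# A long-lived map from index to name. Every value is a separate
# heap-allocated, GC-tracked String that survives each collection.
name_map = Dict{Int,String}(i => "x[$i]" for i in 1:n)

# Full collection while the strings are still reachable.
with_strings = full_gc_seconds()

println("full GC, baseline:          ", baseline, " s")
println("full GC, ", n, " live Strings: ", with_strings, " s")
```

Because name_map stays reachable, every later full collection has to visit each of its strings again, which is the cost described above.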
Workaround
The main workaround at present is JuMP.set_string_names_on_creation(model, false), which does not pass string names from JuMP to the solver.
The downside of this workaround is that solvers and file writers cannot print or display useful names to the user.
- https://github.com/jump-dev/JuMP.jl/issues/2973
- https://github.com/jump-dev/JuMP.jl/issues/2817
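For reference, a minimal sketch of the workaround in use, assuming JuMP is installed. The model is a toy example with no solver attached.

```julia
using JuMP

model = Model()

# Do not create String names for variables and constraints added from now on.
set_string_names_on_creation(model, false)

@variable(model, x[1:3])
@constraint(model, c, sum(x) <= 1)

# The Julia bindings `x` and `c` still work, but no names were stored.
println(repr(name(x[1])))   # empty name
println(repr(name(c)))      # empty name

# Individual names can still be set by hand where they matter.
set_name(x[1], "x[1]")
println(name(x[1]))
```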
Potential approaches
We should try replacing the String storage with an alternative string type. Candidates are:
- https://github.com/JuliaStrings/InlineStrings.jl
- https://github.com/JuliaData/WeakRefStrings.jl
To ensure backwards compatibility, we should make sure that getting and setting these strings uses the original String type, so that the internal storage is hidden from the user.
It is okay to convert between string types, and to generate extra allocations if these new strings are short-lived objects. The real performance problem is long-lived GC-tracked objects.
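As an illustration of the "String at the boundary, something else inside" idea, here is a dependency-free sketch. It is not InlineStrings.jl or WeakRefStrings.jl, and it is not a proposed design for MOI: PackedNames is a hypothetical type that packs all names into one byte buffer, so the GC sees two arrays instead of one object per name. It only supports appending and reading. Renaming and deletion are left out.

```julia
# Hypothetical name store (an assumption for illustration, not MOI API).
struct PackedNames
    bytes::Vector{UInt8}   # concatenated UTF-8 bytes of every name
    offsets::Vector{Int}   # name i occupies bytes[offsets[i]+1:offsets[i+1]]
end

PackedNames() = PackedNames(UInt8[], Int[0])

Base.length(store::PackedNames) = length(store.offsets) - 1

# Setter: accepts any AbstractString and copies its bytes into the buffer.
function Base.push!(store::PackedNames, name::AbstractString)
    append!(store.bytes, codeunits(name))
    push!(store.offsets, length(store.bytes))
    return store
end

# Getter: returns an ordinary String. This allocates, but the result is a
# short-lived object unless the caller chooses to keep it.
function Base.getindex(store::PackedNames, i::Integer)
    start, stop = store.offsets[i] + 1, store.offsets[i + 1]
    return String(store.bytes[start:stop])
end

store = PackedNames()
for i in 1:3
    push!(store, "x[$i]")
end
println(store[2])
```

Callers only ever pass and receive String values, so the storage could later be swapped for one of the candidate packages without changing the public behavior.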
Solvers
Many solver packages also store String names. If the changes to MOI are an improvement, we should investigate ways to simplify name handling in the different solver wrappers.