JuMP.jl
Constraints with few terms are (relatively) slow
Example from discourse: https://discourse.julialang.org/t/optimal-model-creation-and-garbage-collection/72619
julia> using JuMP, Gurobi
julia> function create_model(I, T = 10000)
           model = Model(Gurobi.Optimizer; add_bridges = false)
           set_time_limit_sec(model, 0.001)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model (generic function with 2 methods)
julia> solve_model!(model) = optimize!(model)
solve_model! (generic function with 1 method)
julia> i = 1000;
julia> @time model = create_model(i);
Academic license - for non-commercial use only - expires 2022-01-06
369.139157 seconds (362.98 M allocations: 24.727 GiB, 91.34% gc time)
91% of the time being spent in GC isn't ideal! We should find where this garbage is coming from.
I'm being a choosy beggar, but I'd be curious to see any summaries/charts of where the time/allocations are spent.
A lot of it is creating 20,000,000 ordered dictionaries for the AffExpr in the constraints. I've been going through the MOI side first. Not sure what parts we could optimize in JuMP without adding extra complexity.
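To make that concrete, a minimal sketch (terms is the internal field of JuMP's GenericAffExpr): every affine expression carries its own OrderedDict of terms keyed by variable, so each two-term constraint row above allocates one:
julia> using JuMP

julia> model = Model(); @variable(model, x[1:2]);

julia> expr = x[2] - x[1];  # one AffExpr per constraint row

julia> typeof(expr.terms)   # the per-expression OrderedDict
OrderedCollections.OrderedDict{VariableRef, Float64}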
OK thanks! I guess I should keep this in mind, given my initial choice to use JuMP instead of MOI directly in MOIPajarito.
The ordered dictionary is a big contributor that we can't do much about. One long-term option would be to go the Gravity route with constraint templates (sketched below).
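Roughly, the template idea is this (a hypothetical sketch, not an existing JuMP or Gravity API): store the shared coefficient pattern once, and instantiate each row from index data instead of building a fresh expression per row:
# Hypothetical constraint template: the pattern
# x[i, t] - x[i, t-1] <= rhs has identical coefficients for every
# (i, t), so they can be stored once rather than I*(T-1) times.
struct RampTemplate
    coefficients::Vector{Float64}  # shared across all rows: [1.0, -1.0]
    rhs::Float64
end

# Instantiating a row only records which variable indices it touches.
row_indices(i::Int, t::Int) = ((i, t), (i, t - 1))

template = RampTemplate([1.0, -1.0], 10.0)
rows = [row_indices(i, t) for i in 1:100, t in 2:100]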
But there are a few other contributors as well:
julia> function variables(I, T)
           model = Model()
           @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
           return model
       end
variables (generic function with 1 method)
julia> @benchmark variables(100, 1000)
BenchmarkTools.Trial: 302 samples with 1 evaluation.
Range (min … max): 13.886 ms … 22.610 ms ┊ GC (min … max): 0.00% … 24.44%
Time (median): 15.300 ms ┊ GC (median): 0.00%
Time (mean ± σ): 16.549 ms ± 2.364 ms ┊ GC (mean ± σ): 8.61% ± 11.17%
▆█▅█▁▁▃
▃▆▄▆▆████████▅▅▄▄▃▃▂▃▁▃▁▁▁▁▁▁▁▁▁▁▁▃▃▂▃▃▃▆▄▃▃▆▅▅▇▄▄▆▄▄▂▂▂▂▂▂ ▄
13.9 ms Histogram: frequency by time 21.8 ms <
Memory estimate: 16.72 MiB, allocs estimate: 200142.
julia> @benchmark variables(1000, 10000)
BenchmarkTools.Trial: 3 samples with 1 evaluation.
Range (min … max): 1.822 s … 2.065 s ┊ GC (min … max): 15.53% … 23.95%
Time (median): 2.018 s ┊ GC (median): 21.69%
Time (mean ± σ): 1.968 s ± 129.183 ms ┊ GC (mean ± σ): 20.58% ± 4.36%
█ █ █
█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁█ ▁
1.82 s Histogram: frequency by time 2.07 s <
Memory estimate: 1.27 GiB, allocs estimate: 20000163.
julia> function variables_with_names(I, T)
           model = Model()
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           return model
       end
variables_with_names (generic function with 1 method)
julia> @benchmark variables_with_names(100, 1000)
BenchmarkTools.Trial: 107 samples with 1 evaluation.
Range (min … max): 37.754 ms … 58.620 ms ┊ GC (min … max): 0.00% … 28.66%
Time (median): 48.214 ms ┊ GC (median): 17.45%
Time (mean ± σ): 46.918 ms ± 5.204 ms ┊ GC (mean ± σ): 14.53% ± 8.68%
▃ ▃▁▃█▁ ▆▃ ▃ ▃▃▁▃▁▃▃▁ ▃▆▁ ▁▁
█▁█████▇▁▄▁▁▄▁▁▁▄▁▁▁▁▇▁▇▇▁▇██▄█▄████████▁███▇██▄▁▁▄▁▁▄▄▁▁▁▄ ▄
37.8 ms Histogram: frequency by time 56.9 ms <
Memory estimate: 46.80 MiB, allocs estimate: 900174.
julia> @benchmark variables_with_names(1000, 10000)
BenchmarkTools.Trial: 1 sample with 1 evaluation.
Single result which took 9.680 s (41.27% GC) to evaluate,
with a memory estimate of 4.21 GiB, over 90000231 allocations.
So the names overhead is pretty much just the cost of creating a dictionary to store the names. We can't do much better:
function test_names(N)
    x = Dict{Int,String}()
    for i in 1:N
        x[i] = string("x[", i, "]")
    end
    return x
end

function test_names_jump(N)
    model = Model()
    @variable(model, x[1:N])
    return model
end
julia> @time test_names(100_000);
0.075508 seconds (599.52 k allocations: 31.601 MiB)
julia> @time test_names_jump(100_000);
0.080933 seconds (700.17 k allocations: 37.647 MiB)
julia> @time test_names(1_000_000);
1.598591 seconds (6.00 M allocations: 324.562 MiB, 40.13% gc time)
julia> @time test_names_jump(1_000_000);
1.917196 seconds (7.00 M allocations: 360.841 MiB, 40.85% gc time)
That means a small win for the OP is to use anonymous variables instead:
julia> using JuMP, BenchmarkTools, GLPK
julia> function create_model(I, T = 10000)
           model = Model(GLPK.Optimizer)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model (generic function with 2 methods)
julia> @benchmark create_model(100, 1000)
BenchmarkTools.Trial: 18 samples with 1 evaluation.
Range (min … max): 247.279 ms … 307.278 ms ┊ GC (min … max): 17.74% … 33.75%
Time (median): 282.037 ms ┊ GC (median): 27.44%
Time (mean ± σ): 280.924 ms ± 15.225 ms ┊ GC (mean ± σ): 27.40% ± 3.75%
▁ ▁ ▁ ▁ ▁ █ ▁█ ▁ ▁▁ ▁ ▁▁ █
█▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁█▁▁█▁▁▁▁█▁█▁▁▁▁▁██▁█▁██▁▁█▁██▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
247 ms Histogram: frequency by time 307 ms <
Memory estimate: 256.05 MiB, allocs estimate: 3598106.
julia> function create_model_2(I, T = 10000)
           model = Model(GLPK.Optimizer)
           x = @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model_2 (generic function with 2 methods)
julia> @benchmark create_model_2(100, 1000)
BenchmarkTools.Trial: 24 samples with 1 evaluation.
Range (min … max): 183.181 ms … 244.146 ms ┊ GC (min … max): 13.25% … 31.49%
Time (median): 214.783 ms ┊ GC (median): 24.05%
Time (mean ± σ): 214.715 ms ± 13.848 ms ┊ GC (mean ± σ): 24.19% ± 4.50%
▃▃█ █▃
▇▁▁▁▁▁▁▇▇▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁███▁▁▇██▁▁▁▇▇▁▁▇▇▁▁▇▁▁▇▁▁▁▁▁▁▁▇▁▁▁▁▇ ▁
183 ms Histogram: frequency by time 244 ms <
Memory estimate: 227.49 MiB, allocs estimate: 2998074.
So the obvious question here would be:
When creating the @constraint, why do we first create an AffExpr, and then convert it into a MOI.ScalarAffineFunction?
We don't cache the AffExpr, so it plays no role other than to avoid a canonicalization step.
We need to return AffExpr and QuadExpr objects to users when the expressions they create are visible, but we don't need to if the object will never be returned to the user. It's not always a win: in some cases the cost of a final canonicalization might outweigh the benefit of not constructing the OrderedDict, but we could at least investigate.
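For example, the canonicalization the OrderedDict provides is the merging of duplicate terms as an expression is built:
julia> model = Model(); @variable(model, y);

julia> 2y + 3y  # the OrderedDict merges the duplicate y terms
5 y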
Edit: you could imagine a tag like this to skip the default path:
function create_model_2(I, T = 10000)
    model = Model(GLPK.Optimizer)
    x = @variable(model, [1:I, 1:T], lower_bound = 0, upper_bound = 100)
    @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10, SkipJuMPConstructors())
    @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10, SkipJuMPConstructors())
    @objective(model, Min, sum(x))
    return model
end
For dicts, we can gain a bit with sizehint!:
function test_names2(N)
    x = Dict{Int,String}()
    sizehint!(x, N)
    for i in 1:N
        x[i] = string("x[", i, "]")
    end
    return x
end
julia> @benchmark test_names(1_000_000)
BenchmarkTools.Trial: 8 samples with 1 evaluation.
Range (min … max): 612.429 ms … 878.203 ms ┊ GC (min … max): 42.82% … 57.87%
Time (median): 702.485 ms ┊ GC (median): 47.26%
Time (mean ± σ): 708.265 ms ± 86.590 ms ┊ GC (mean ± σ): 47.80% ± 5.93%
█ ██ █ █ █ █ █
█▁▁▁▁██▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁█▁▁▁█▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁
612 ms Histogram: frequency by time 878 ms <
Memory estimate: 324.56 MiB, allocs estimate: 5999543.
julia> @benchmark test_names2(1_000_000)
BenchmarkTools.Trial: 8 samples with 1 evaluation.
Range (min … max): 510.264 ms … 719.390 ms ┊ GC (min … max): 23.09% … 49.43%
Time (median): 678.483 ms ┊ GC (median): 45.32%
Time (mean ± σ): 644.980 ms ± 78.198 ms ┊ GC (mean ± σ): 42.50% ± 9.92%
█ █ █ █ █ █ █ █
█▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁█▁▁█▁▁█▁▁█▁▁▁█ ▁
510 ms Histogram: frequency by time 719 ms <
Memory estimate: 293.39 MiB, allocs estimate: 5999496.
Yeah, part of the issue is that we add them one by one, so we don't have this size information.
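Concretely, each name reaches the backend through a separate call, roughly like this (a sketch of the one-by-one path; MOI.VariableName is the real attribute, name_one_by_one is just illustrative):
using JuMP
const MOI = JuMP.MOI

# Sketch of why sizehint! can't help here: names arrive one call at
# a time, so the backend never sees the total count up front.
function name_one_by_one(model::Model, x::Vector{VariableRef})
    for (i, xi) in enumerate(x)
        MOI.set(backend(model), MOI.VariableName(), index(xi), string("x[", i, "]"))
    end
    return
end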
using ProfileView
using JuMP, Gurobi
function main(I, T)
    model = Model(Gurobi.Optimizer)
    set_silent(model)
    @variable(model, x[1:T, 1:I])
    @constraint(model, [i in 1:I, t in 2:T], x[t, i] - x[t-1, i] <= 10)
    @constraint(model, [i in 1:I, t in 2:T], x[t, i] - x[t-1, i] >= -10)
    @objective(model, Min, sum(x))
    set_time_limit_sec(model, 0.0)
    optimize!(model)
    return model
end
@time main(1000, 1000)
@profview main(1000, 1000)

[flame graph from @profview]

- The long red bar in the middle is copy_constraints. The orange bar above that is rehashing the dictionary which maps rows to ConstraintInfo in Gurobi.jl. We could improve that by switching to CleverDict, which would use a vector instead of a dict. The long green bar to the right is GRBaddconstraint.
- The two orange bars on the left are creating OrderedDicts for the JuMP model.
- The top red bar on the left is adding variable names.
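The CleverDict idea, roughly (a sketch, not the actual MOI.Utilities.CleverDict implementation): because row indices are handed out consecutively starting from 1, a plain Vector can stand in for the hash table, so lookups are array indexing and growth never rehashes:
# Sketch: keys 1, 2, 3, ... are assigned in insertion order, so a
# Vector replaces the Dict{Int,ConstraintInfo}-style hash table.
struct VectorMap{V}
    values::Vector{V}
end
VectorMap{V}() where {V} = VectorMap(V[])

function add_item!(m::VectorMap{V}, value::V) where {V}
    push!(m.values, value)   # amortized growth, no rehashing
    return length(m.values)  # the newly assigned key
end

Base.getindex(m::VectorMap, key::Int) = m.values[key]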
We actually considered the small-dict case a while ago: https://github.com/jump-dev/JuMP.jl/issues/1654
I'm tempted to close this issue. We now have an option to skip the most expensive parts (name creation).
Skipping the JuMP-level constructors altogether is too magical. If people are running into performance problems with JuMP, then they should really be using the low-level solver interfaces instead, and even then, they're likely building problems that take a long time to solve, so this isn't the bottleneck.
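For reference, a rough sketch (not benchmarked here) of what dropping to the low-level interface looks like for the ramp constraints above, building MOI.ScalarAffineFunctions directly and skipping JuMP's AffExpr (and its OrderedDict) entirely:
using MathOptInterface
const MOI = MathOptInterface

# Sketch: build each two-term row directly as a
# MOI.ScalarAffineFunction and add it to any MOI.ModelLike backend.
function add_ramp_constraints(model::MOI.ModelLike, x::Matrix{MOI.VariableIndex}, I, T)
    for i in 1:I, t in 2:T
        f = MOI.ScalarAffineFunction(
            [MOI.ScalarAffineTerm(1.0, x[i, t]),
             MOI.ScalarAffineTerm(-1.0, x[i, t-1])],
            0.0,
        )
        MOI.add_constraint(model, f, MOI.LessThan(10.0))
        MOI.add_constraint(model, f, MOI.GreaterThan(-10.0))
    end
    return
end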
The original problem is now:
julia> using JuMP, Gurobi
julia> function create_model(I, T = 10000)
           model = Model(Gurobi.Optimizer; add_bridges = false)
           set_time_limit_sec(model, 0.001)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model (generic function with 2 methods)
julia> function create_model2(I, T = 10000)
           model = Model(Gurobi.Optimizer; add_bridges = false)
           set_string_names_on_creation(model, false)
           set_time_limit_sec(model, 0.001)
           @variable(model, 0 <= x[1:I, 1:T] <= 100)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] <= 10)
           @constraint(model, [i in 1:I, t in 2:T], x[i, t] - x[i, t-1] >= -10)
           @objective(model, Min, sum(x))
           return model
       end
create_model2 (generic function with 2 methods)
julia> i = 1000;
julia> model = nothing
julia> GC.gc()
julia> @time model = create_model(i);
Set parameter TimeLimit to value 0.001
52.479629 seconds (362.28 M allocations: 24.322 GiB, 43.94% gc time, 3.96% compilation time)
julia> model = nothing
julia> GC.gc()
julia> @time model = create_model2(i);
Set parameter TimeLimit to value 0.001
33.094410 seconds (300.14 M allocations: 21.726 GiB, 25.91% gc time, 0.55% compilation time)
So I think we can safely close this.
We can always re-open if people have evidence of other large-scale problems where this is a bottleneck.