MathOptInterface.jl
Tracking the time-to-first-solve issue
We know we have a time-to-first-solve problem. There have been a few different attempts at solving it (#1156, #1249, #1251, #1252), but they're getting a bit scattered, so I thought I would open an issue to track progress.
Currently, things are going in the wrong direction.
I'm going to argue that this benchmark is very important to users, particularly new users, and we should put a high priority on improving it. The script is set up to replicate how JuMP builds MOI models.
1.1.0
(base) oscar@Oscars-MBP time_to_first_solve % ~/julia --project=. script.jl clp
Running: clp
10.488984 seconds (28.36 M allocations: 1.654 GiB, 5.22% gc time, 15.34% compilation time)
0.001143 seconds (1.97 k allocations: 161.828 KiB)
(base) oscar@Oscars-MBP time_to_first_solve % ~/julia --project=. script.jl clp --no-bridge
Running: clp --no-bridge
4.820104 seconds (10.02 M allocations: 628.892 MiB, 4.53% gc time, 38.05% compilation time)
0.001012 seconds (2.43 k allocations: 221.797 KiB)
(base) oscar@Oscars-MBP time_to_first_solve % ~/julia --project=. script.jl glpk
Running: glpk
6.541374 seconds (17.35 M allocations: 1.026 GiB, 5.63% gc time, 28.40% compilation time)
0.000534 seconds (1.25 k allocations: 84.609 KiB)
(base) oscar@Oscars-MBP time_to_first_solve % ~/julia --project=. script.jl glpk --no-bridge
Running: glpk --no-bridge
4.013187 seconds (5.79 M allocations: 342.078 MiB, 4.16% gc time, 99.94% compilation time)
0.000375 seconds (680 allocations: 65.672 KiB)
0.10.6
(base) oscar@Oscars-MBP perf % ~/julia --project=. time_to_first_solve.jl clp
Running: clp
11.682708 seconds (27.87 M allocations: 1.710 GiB, 7.23% gc time, 11.85% compilation time)
0.001102 seconds (3.48 k allocations: 303.906 KiB)
(base) oscar@Oscars-MBP perf % ~/julia --project=. time_to_first_solve.jl clp --no-bridge
Running: clp --no-bridge
5.788850 seconds (12.06 M allocations: 744.691 MiB, 5.39% gc time, 13.72% compilation time)
0.001066 seconds (2.48 k allocations: 241.484 KiB)
(base) oscar@Oscars-MBP perf % ~/julia --project=. time_to_first_solve.jl glpk
Running: glpk
7.302795 seconds (18.49 M allocations: 1.082 GiB, 6.49% gc time, 22.80% compilation time)
0.000359 seconds (1.22 k allocations: 84.297 KiB)
(base) oscar@Oscars-MBP perf % ~/julia --project=. time_to_first_solve.jl glpk --no-bridge
Running: glpk --no-bridge
4.450319 seconds (8.48 M allocations: 498.214 MiB, 4.55% gc time, 99.97% compilation time)
0.000252 seconds (634 allocations: 67.031 KiB)
0.9.22
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp
Running: clp
16.711929 seconds (36.79 M allocations: 2.105 GiB, 5.74% gc time, 45.43% compilation time)
0.001412 seconds (4.64 k allocations: 345.617 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl clp --no-bridge
Running: clp --no-bridge
10.783137 seconds (18.47 M allocations: 1.050 GiB, 5.39% gc time, 96.52% compilation time)
0.000804 seconds (2.09 k allocations: 177.289 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk
Running: glpk
13.895745 seconds (31.22 M allocations: 1.786 GiB, 6.35% gc time, 53.08% compilation time)
0.000603 seconds (2.71 k allocations: 186.906 KiB)
(base) oscar@Oscars-MBP auto-cache % ~/julia --project=. bench.jl glpk --no-bridge
Running: glpk --no-bridge
10.406834 seconds (20.02 M allocations: 1.142 GiB, 5.85% gc time, 99.96% compilation time)
0.000545 seconds (2.01 k allocations: 162.797 KiB)
0.9.21
Running clp
21.338810 seconds (53.80 M allocations: 3.018 GiB, 6.56% gc time, 37.14% compilation time)
0.001744 seconds (5.96 k allocations: 535.180 KiB)
Running clp --no-bridge
10.575496 seconds (18.97 M allocations: 1.078 GiB, 5.09% gc time, 96.11% compilation time)
0.000997 seconds (3.02 k allocations: 309.555 KiB)
Running glpk
14.620199 seconds (33.51 M allocations: 1.916 GiB, 6.58% gc time, 53.13% compilation time)
0.000694 seconds (3.18 k allocations: 250.125 KiB)
Running glpk --no-bridge
11.557972 seconds (20.52 M allocations: 1.170 GiB, 4.85% gc time, 99.98% compilation time)
0.000487 seconds (2.52 k allocations: 234.297 KiB)
0.9.20
Running clp
15.997227 seconds (47.68 M allocations: 2.716 GiB, 9.29% gc time, 19.53% compilation time)
0.001548 seconds (3.43 k allocations: 279.242 KiB)
Running clp --no-bridge
5.052817 seconds (12.42 M allocations: 732.762 MiB, 6.70% gc time, 90.03% compilation time)
0.000808 seconds (1.84 k allocations: 174.609 KiB)
Running glpk
11.157795 seconds (35.30 M allocations: 2.023 GiB, 8.41% gc time, 27.83% compilation time)
0.000472 seconds (1.79 k allocations: 125.258 KiB)
Running glpk --no-bridge
7.441518 seconds (21.76 M allocations: 1.248 GiB, 6.89% gc time, 90.56% compilation time)
0.000337 seconds (1.39 k allocations: 127.336 KiB)
Script
setup.jl
import Pkg
if ARGS[1] == "v0.9.20"
    Pkg.add([
        Pkg.PackageSpec(name = "Clp"),
        Pkg.PackageSpec(name = "GLPK"),
        Pkg.PackageSpec(name = "MathOptInterface", version = v"0.9.20"),
    ])
elseif ARGS[1] == "v0.9.21"
    Pkg.add([
        Pkg.PackageSpec(name = "Clp"),
        Pkg.PackageSpec(name = "GLPK"),
        Pkg.PackageSpec(name = "MathOptInterface", version = v"0.9.21"),
    ])
elseif ARGS[1] == "master"
    Pkg.add([
        Pkg.PackageSpec(name = "Clp", rev = "od/moi10"),
        Pkg.PackageSpec(name = "GLPK", rev = "od/moi10"),
        Pkg.PackageSpec(name = "MathOptInterface", rev = "master"),
    ])
end
bench.jl
using MathOptInterface, GLPK, Clp
const MOI = MathOptInterface
function example_diet(optimizer, bridge)
    category_data = [
        1800.0 2200.0;
        91.0 Inf;
        0.0 65.0;
        0.0 1779.0
    ]
    cost = [2.49, 2.89, 1.50, 1.89, 2.09, 1.99, 2.49, 0.89, 1.59]
    food_data = [
        410 24 26 730;
        420 32 10 1190;
        560 20 32 1800;
        380 4 19 270;
        320 12 10 930;
        320 15 12 820;
        320 31 12 1230;
        100 8 2.5 125;
        330 8 10 180
    ]
    bridge_model = if bridge
        MOI.instantiate(optimizer; with_bridge_type=Float64)
    else
        MOI.instantiate(optimizer)
    end
    model = MOI.Utilities.CachingOptimizer(
        MOI.Utilities.UniversalFallback(MOI.Utilities.Model{Float64}()),
        MOI.Utilities.AUTOMATIC,
    )
    MOI.Utilities.reset_optimizer(model, bridge_model)
    MOI.set(model, MOI.Silent(), true)
    nutrition = MOI.add_variables(model, size(category_data, 1))
    for (i, v) in enumerate(nutrition)
        MOI.add_constraint(
            model,
            MOI.SingleVariable(v),
            MOI.GreaterThan(category_data[i, 1]),
        )
        MOI.add_constraint(
            model,
            MOI.SingleVariable(v),
            MOI.LessThan(category_data[i, 2]),
        )
    end
    buy = MOI.add_variables(model, size(food_data, 1))
    MOI.add_constraint.(model, MOI.SingleVariable.(buy), MOI.GreaterThan(0.0))
    MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)
    f = MOI.ScalarAffineFunction(MOI.ScalarAffineTerm.(cost, buy), 0.0)
    MOI.set(model, MOI.ObjectiveFunction{typeof(f)}(), f)
    for (j, n) in enumerate(nutrition)
        f = MOI.ScalarAffineFunction(
            MOI.ScalarAffineTerm.(food_data[:, j], buy),
            0.0,
        )
        push!(f.terms, MOI.ScalarAffineTerm(-1.0, n))
        MOI.add_constraint(model, f, MOI.EqualTo(0.0))
    end
    MOI.optimize!(model)
    term_status = MOI.get(model, MOI.TerminationStatus())
    @assert term_status == MOI.OPTIMAL
    MOI.add_constraint(
        model,
        MOI.ScalarAffineFunction(
            MOI.ScalarAffineTerm.(1.0, [buy[end-1], buy[end]]),
            0.0,
        ),
        MOI.LessThan(6.0),
    )
    MOI.optimize!(model)
    @assert MOI.get(model, MOI.TerminationStatus()) == MOI.INFEASIBLE
    return
end
if length(ARGS) > 0
    bridge = get(ARGS, 2, "") != "--no-bridge"
    println("Running: $(ARGS[1]) $(get(ARGS, 2, ""))")
    if ARGS[1] == "clp"
        @time example_diet(Clp.Optimizer, bridge)
        @time example_diet(Clp.Optimizer, bridge)
    else
        @assert ARGS[1] == "glpk"
        @time example_diet(GLPK.Optimizer, bridge)
        @time example_diet(GLPK.Optimizer, bridge)
    end
    exit(0)
end
run.sh
function run_benchmark() {
    ~/julia --project=. --depwarn=error bench.jl clp
    ~/julia --project=. --depwarn=error bench.jl clp --no-bridge
    ~/julia --project=. --depwarn=error bench.jl glpk
    ~/julia --project=. --depwarn=error bench.jl glpk --no-bridge
}
~/julia --project=. setup.jl v0.9.20
run_benchmark
~/julia --project=. setup.jl v0.9.21
run_benchmark
~/julia --project=. setup.jl master
run_benchmark
The bridging case
include("bench.jl")
using SnoopCompile
tinf = @snoopi_deep example_diet(Clp.Optimizer, true)
julia> tinf
InferenceTimingNode: 11.698169/22.652688 on InferenceFrameInfo for Core.Compiler.Timings.ROOT() with 107 direct children
julia> staleinstances(tinf)
SnoopCompileCore.InferenceTiming[]
This call took 22.65 seconds, of which 11.7 seconds was spent on something other than inference (e.g., code generation). But that means nearly 11 seconds was spent on type inference! This is a good target to attack.
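As a side note, one way to see where that inference time goes is to flatten the timing tree; a minimal sketch, assuming SnoopCompile's exported flatten helper (this is not output from the issue):

using SnoopCompile
frames = flatten(tinf)               # one InferenceTiming per inference frame, sorted by exclusive time
foreach(println, last(frames, 10))   # the ten frames with the most inference time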
julia> using ProfileView
julia> fg = flamegraph(tinf)
Node(FlameGraphs.NodeData(ROOT() at typeinfer.jl:73, 0x00, 0:22666765718))
julia> ProfileView.view(fg)
Most of this is time spent inferring copy_to (as expected), but why is there also a whole lot of time spent on deleting objective bridges? That's due to
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Bridges/bridge_optimizer.jl#L882-L891
In order to infer how to set the objective sense, we also need to infer how to delete an objective bridge, which involves a whooooole lot of stuff.
Moreover, this function is red because the method is owned by MOI.Bridges, but it uses a type owned by Clp.
The big stack on top of the red is:
Base.precompile(Tuple{typeof(MathOptInterface.delete),UniversalFallback{GenericModel{Float64, ModelFunctionConstraints{Float64}}},MathOptInterface.VariableIndex})
but this didn't seem to help.
Let's look at inference triggers:
julia> itrigs = inference_triggers(tinf)
106-element Vector{InferenceTrigger}:
Inference triggered to call MethodInstance for hvcat(::NTuple{4, Int64}, ::Float64, ::Vararg{Float64, N} where N) from example_diet (/Users/oscar/Documents/JuMP/performance/auto-cache/bench.jl:5) with specialization MethodInstance for example_diet(::Type, ::Bool)
Inference triggered to call MethodInstance for promote_typeof(::Int64, ::Int64, ::Vararg{Any, N} where N) from hvcat (./abstractarray.jl:1931) inlined into MethodInstance for example_diet(::Type, ::Bool) (/Users/oscar/Documents/JuMP/performance/auto-cache/bench.jl:12)
Inference triggered to call MethodInstance for promote_typeof(::Int64, ::Float64, ::Vararg{Any, N} where N) from promote_typeof (./promotion.jl:272) with specialization MethodInstance for promote_typeof(::Int64, ::Int64, ::Vararg{Any, N} where N)
Inference triggered to call MethodInstance for promote_typeof(::Float64, ::Int64, ::Vararg{Any, N} where N) from promote_typeof (./promotion.jl:272) with specialization MethodInstance for promote_typeof(::Int64, ::Float64, ::Vararg{Any, N} where N)
Inference triggered to call MethodInstance for typed_hvcat(::Type{Float64}, ::NTuple{9, Int64}, ::Int64, ::Vararg{Number, N} where N) from hvcat (./abstractarray.jl:1931) inlined into MethodInstance for example_diet(::Type, ::Bool) (/Users/oscar/Documents/JuMP/performance/auto-cache/bench.jl:12)
Inference triggered to call MethodInstance for var"#instantiate#23"(::Type{Float64}, ::Bool, ::typeof(MathOptInterface.instantiate), ::Type) from instantiate##kw (/Users/oscar/.julia/dev/MathOptInterface/src/instantiate.jl:120) with specialization MethodInstance for (::MathOptInterface.var"#instantiate##kw")(::NamedTuple{(:with_bridge_type,), Tuple{DataType}}, ::typeof(MathOptInterface.instantiate), ::Type)
[...]
julia> mtrigs = accumulate_by_source(Method, itrigs)
41-element Vector{SnoopCompile.TaggedTriggers{Method}}:
get(model::Union{MathOptInterface.Utilities.AbstractModelLike{T}, MathOptInterface.Utilities.AbstractOptimizer{T}} where T, ::MathOptInterface.ListOfVariableAttributesSet) in MathOptInterface.Utilities at /Users/oscar/.julia/dev/MathOptInterface/src/Utilities/model.jl:239 (1 callees from 1 callers)
(::MathOptInterface.var"#instantiate##kw")(::Any, ::typeof(MathOptInterface.instantiate), optimizer_constructor) in MathOptInterface at /Users/oscar/.julia/dev/MathOptInterface/src/instantiate.jl:115 (1 callees from 1 callers)
attach_optimizer(model::MathOptInterface.Utilities.CachingOptimizer) in MathOptInterface.Utilities at /Users/oscar/.julia/dev/MathOptInterface/src/Utilities/cachingoptimizer.jl:181 (1 callees from 1 callers)
empty!(model::Clp.Optimizer) in Clp at /Users/oscar/.julia/packages/Clp/E3N8m/src/MOI_wrapper/MOI_wrapper.jl:79 (1 callees from 1 callers)
[...]
The non-bridging case
These should be fixable; they're all ModelVectorConstraints objects.

@blegat, so a lot of the slowdown comes from the design of StructOfConstraints forcing the compiler to go over every possible combination of function and set.
Something like ListOfConstraintTypesPresent is particularly bad.
We start here:
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Utilities/model.jl#L628-L646
The first line calls the following (and the copy can be removed):
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Utilities/struct_of_constraints.jl#L83-L90
which calls
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Utilities/struct_of_constraints.jl#L245-L247
going through every field (also a StructOfConstraints) and eventually calling:
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Utilities/vector_of_constraints.jl#L117-L122
So in total, 97 leaf [] vectors are created, which get vcat'd into 5 arrays (one per function type), which get vcat'd into a single ListOfConstraintTypesPresent result. That also means inferring and compiling 97 versions of
https://github.com/jump-dev/MathOptInterface.jl/blob/2e99e255d435f556360ac9df3762b6026f7c5447/src/Utilities/vector_of_constraints.jl#L117-L122
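To make that fan-out concrete, here is a hedged sketch (my own example, not part of the benchmark script): the first ListOfConstraintTypesPresent query on a fresh Model{Float64} pays the cost of compiling a getter for every leaf VectorOfConstraints field, while the second query is nearly free.

using MathOptInterface
const MOI = MathOptInterface
model = MOI.Utilities.Model{Float64}()
# First call: infers and compiles the recursion through every (function, set) leaf.
@time MOI.get(model, MOI.ListOfConstraintTypesPresent())
# Second call: everything is compiled, so only the (empty) vcat work remains.
@time MOI.get(model, MOI.ListOfConstraintTypesPresent())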
How much of this can be addressed by precompile statements?
For some reason, the precompile statements don't help: if you generate them and add them to the package, the methods are still inferred at runtime. From https://timholy.github.io/SnoopCompile.jl/stable/snoopi_deep_parcel/:

But one of the methods that failed is
julia> /Users/oscar/.julia/dev/MathOptInterface/src/Utilities/struct_of_constraints.jl:313, MethodInstance for (::Type{MathOptInterface.Utilities.ModelVectorConstraints{_A, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.Reals}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.Zeros}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.Nonnegatives}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.Nonpositives}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.Complements}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.NormInfinityCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.NormOneCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.SecondOrderCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.RotatedSecondOrderCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.GeometricMeanCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.ExponentialCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.DualExponentialCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.RelativeEntropyCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.NormSpectralCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.NormNuclearCone}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.PositiveSemidefiniteConeTriangle}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.PositiveSemidefiniteConeSquare}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.RootDetConeTriangle}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.RootDetConeSquare}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.LogDetConeTriangle}, MathOptInterface.Utilities.VectorOfConstraints{MathOptInterface.VectorOfVariables, MathOptInterface.LogDetConeSquare}, _B, _C, _D, _E, _F, _G}} where {_A, _B, _C, _D, _E, _F, _G})()
which suggests it's failing to infer the element type _A=Float64 and calling dynamic dispatch.
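For reference, precompile statements like the one above come out of the parcel workflow described on the SnoopCompile page linked earlier; a hedged sketch (the output directory is just an example):

using SnoopCompile
ttot, pcs = SnoopCompile.parcel(tinf)        # group inference frames by the package that owns them
SnoopCompile.write("/tmp/precompiles", pcs)  # emit one precompile file per package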
(Edit: the previous comment was wrong; it has now been deleted.)
One cause of the slow-down is the following:
julia> using MathOptInterface, Clp
julia> @time @eval MathOptInterface.Utilities.Model{Float64}()
4.884869 seconds (11.88 M allocations: 694.229 MiB, 7.17% gc time, 100.03% compilation time)
MOIU.GenericModel{Float64, MOIU.ModelFunctionConstraints{Float64}}
julia> using MathOptInterface
julia> @time @eval MathOptInterface.Utilities.Model{Float64}()
0.294457 seconds (539.43 k allocations: 33.832 MiB, 26.33% gc time, 100.77% compilation time)
MOIU.Model{Float64}
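A plausible way to check whether loading Clp invalidates code that Model{Float64}() depends on is SnoopCompile's invalidation tooling; a hedged sketch (not something that was run in this issue):

using SnoopCompileCore
using MathOptInterface                     # load MOI first so its code can be invalidated
invalidations = @snoopr using Clp          # record method invalidations caused by loading Clp
using SnoopCompile                         # the analysis half of SnoopCompile
trees = invalidation_trees(invalidations)
show(trees[end])                           # the largest invalidation tree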
In order to infer how to set the objective sense, we also need to infer how to delete an objective bridge, which involves a whooooole lot of stuff.
That's interesting. So the fact that all constraints are supported, and that we can see this at compile time, does indeed let us avoid compiling the constraint-bridging code. The objective is different: setting a new objective should remove the old one, and the type of the objective you are setting doesn't tell you whether the previous one was bridged.
Hmm. I'm stuck on handling deletion in UniversalFallback and Utilities.Model:
julia> using MathOptInterface
julia> const MOI = MathOptInterface
MathOptInterface
julia> model = MOI.Utilities.Model{Float64}()
MOIU.GenericModel{Float64, MOIU.ModelFunctionConstraints{Float64}}
julia> x = MOI.add_variable(model)
MathOptInterface.VariableIndex(1)
julia> @time @eval MOI.delete(model, x)
11.937550 seconds (17.38 M allocations: 957.306 MiB, 3.09% gc time, 100.01% compilation time)
Fixed in #1315.
The biggest issue now is the inferability of default_copy_to:
https://github.com/jump-dev/MathOptInterface.jl/blob/85f97037e296e898389cb5086f4e77800ed52007/src/Utilities/copy.jl#L638-L650
Things like this are quite bad: https://github.com/jump-dev/MathOptInterface.jl/blob/85f97037e296e898389cb5086f4e77800ed52007/src/Utilities/copy.jl#L457-L509
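One way to eyeball that inferability locally is @code_warntype on a tiny copy; a hedged sketch (the toy model is mine, not part of the benchmark):

using InteractiveUtils                 # for @code_warntype outside the REPL
using MathOptInterface, GLPK
const MOI = MathOptInterface
src = MOI.Utilities.Model{Float64}()
MOI.add_variable(src)
dest = GLPK.Optimizer()
@code_warntype MOI.copy_to(dest, src)  # red/Any entries mark the dynamic-dispatch points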
Removing this from the 0.10 milestone. We've made some good progress, and there is a plan to improve things further in the next JuMP release.
Take a look at https://github.com/JuliaDiff/ChainRules.jl/pull/499
A big offender is:
using SnoopCompile
using GLPK; const MOI = GLPK.MOI
model = MOI.instantiate(GLPK.Optimizer; with_bridge_type = Float64)
f(model) = MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)
tinf = @snoopi_deep f(model)
itrigs = inference_triggers(tinf)
mtrigs = accumulate_by_source(Method, itrigs)
using ProfileView
fg = flamegraph(tinf)
ProfileView.view(fg)
Closing because things are now muuuuch better (with Julia 1.9)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl clp
Running: clp
3.103880 seconds (3.67 M allocations: 246.743 MiB, 4.04% gc time, 99.69% compilation time: <1% of which was recompilation)
0.001400 seconds (2.95 k allocations: 153.984 KiB)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl clp --no-bridge
Running: clp --no-bridge
1.396099 seconds (1.39 M allocations: 95.472 MiB, 3.82% gc time, 99.63% compilation time)
0.001089 seconds (1.74 k allocations: 122.922 KiB)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl glpk
Running: glpk
1.899633 seconds (2.60 M allocations: 179.593 MiB, 4.50% gc time, 99.74% compilation time: <1% of which was recompilation)
0.000728 seconds (2.26 k allocations: 89.062 KiB)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl glpk --no-bridge
Running: glpk --no-bridge
0.525569 seconds (634.41 k allocations: 43.383 MiB, 5.80% gc time, 99.73% compilation time)
0.000331 seconds (532 allocations: 58.844 KiB)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl highs
Running: highs
0.658714 seconds (797.60 k allocations: 54.717 MiB, 98.87% compilation time: 31% of which was recompilation)
0.002237 seconds (2.27 k allocations: 89.609 KiB)
(base) oscar@Oscars-MBP /tmp % julia +1.9 --project=. --depwarn=error bench.jl highs --no-bridge
Running: highs --no-bridge
1.474796 seconds (1.52 M allocations: 104.449 MiB, 2.58% gc time, 99.77% compilation time)
0.001385 seconds (483 allocations: 53.781 KiB)
We can continue to improve by adding PrecompileTools, but let's track that in #2226.
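For reference, the PrecompileTools approach would look roughly like the following; a hedged sketch with a hypothetical module name and workload, not the code proposed in #2226:

module TTFSPrecompile  # hypothetical module name
using PrecompileTools
using MathOptInterface
const MOI = MathOptInterface
@setup_workload begin
    @compile_workload begin
        # Build a tiny representative model so its specializations are cached
        # in the package image at precompile time.
        model = MOI.Utilities.CachingOptimizer(
            MOI.Utilities.UniversalFallback(MOI.Utilities.Model{Float64}()),
            MOI.Utilities.AUTOMATIC,
        )
        x = MOI.add_variable(model)
        MOI.add_constraint(model, x, MOI.GreaterThan(0.0))
        MOI.set(model, MOI.ObjectiveSense(), MOI.MIN_SENSE)
        f = MOI.ScalarAffineFunction([MOI.ScalarAffineTerm(1.0, x)], 0.0)
        MOI.set(model, MOI.ObjectiveFunction{typeof(f)}(), f)
    end
end
end  # module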