Compilation time regression in 1.11
I've boiled this down to the reproducer below. I can try to simplify it further, but this is as far as I've gotten.
julia 1.10
dot: 0.311097 seconds (2.51 M allocations: 157.250 MiB, 10.19% gc time, 99.98% compilation time)
julia 1.11
dot: 0.978774 seconds (13.59 M allocations: 612.654 MiB, 19.87% gc time, 99.99% compilation time)
reproducer
#=
julia +1.10 --project
julia +1.11 --project
using TestEnv; TestEnv.activate() # if you don't have Test in your env.
using Revise; include("../julia_111_perf_regression_reproducer2.jl")
=#
import ClimaComms
ClimaComms.@import_required_backends
using Test
import Random: seed!
using ClimaCore
import ClimaCore:
    Geometry,
    Domains,
    Meshes,
    Topologies,
    Hypsography,
    Spaces,
    Fields,
    Operators,
    Quadratures
using ClimaCore.MatrixFields
function test_spaces(::Type{FT}) where {FT}
    velem = 20
    helem = npoly = 1
    comms_device = ClimaComms.device()
    comms_ctx = ClimaComms.SingletonCommsContext(comms_device)
    hdomain = Domains.SphereDomain(FT(10))
    hmesh = Meshes.EquiangularCubedSphere(hdomain, helem)
    htopology = Topologies.Topology2D(comms_ctx, hmesh)
    quad = Quadratures.GLL{npoly + 1}()
    hspace = Spaces.SpectralElementSpace2D(htopology, quad)
    vdomain = Domains.IntervalDomain(
        Geometry.ZPoint{FT}(0),
        Geometry.ZPoint{FT}(10);
        boundary_names = (:bottom, :top),
    )
    vmesh = Meshes.IntervalMesh(vdomain, nelems = velem)
    vtopology = Topologies.IntervalTopology(comms_ctx, vmesh)
    vspace = Spaces.CenterFiniteDifferenceSpace(vtopology)
    sfc_coord = Fields.coordinate_field(hspace)
    hypsography = Hypsography.Flat()
    center_space = Spaces.ExtrudedFiniteDifferenceSpace(hspace, vspace, hypsography)
    face_space = Spaces.FaceExtrudedFiniteDifferenceSpace(center_space)
    return center_space, face_space
end
function random_field(::Type{T}, space) where {T}
    FT = Spaces.undertype(space)
    field = Fields.Field(T, space)
    parent(field) .= rand.(FT)
    return field
end
@testset "Matrix Multiplication At Boundaries" begin
    FT = Float64
    center_space, face_space = test_spaces(FT)
    seed!(1) # ensures reproducibility
    ᶜᶜmatrix_with_outside_entries = random_field(TridiagonalMatrixRow{FT}, center_space)
    ᶜᶜmatrix_without_outside_entries = random_field(DiagonalMatrixRow{FT}, center_space)
    @time "dot" @. ᶜᶜmatrix_without_outside_entries ⋅ ᶜᶜmatrix_with_outside_entries
end
I noticed even worse regressions (on the order of 50x) when compiling some functions in JET. x-ref https://github.com/aviatesk/JET.jl/issues/649
As shown in more detail in https://github.com/aviatesk/JET.jl/issues/649#issuecomment-2413759341, the compile time regression does not seem to have been fixed at all by 1.11.1.
Seeing this on SymbolicRegression.jl too. The latency is so large I had to switch back to 1.10 for development.
Please open a new issue with a reproducer there to keep things separate and actionable.
Is anyone working on this? The new circular dependency warnings in 1.11 are a bit annoying, but this slowdown is not great either.
The JET issue is that JET intentionally prevents (or rather does not activate) compilation caching in v1.11, so it isn't really a comparable case: the package doesn't run the same code.
The ClimaCore load performance seems slower again on v1.12 / master:
julia> @time "dot" @. ᶜᶜmatrix_without_outside_entries * ᶜᶜmatrix_with_outside_entries;
dot: 1.974254 seconds (13.16 M allocations: 537.688 MiB, 20.03% gc time, 99.87% compilation time)
It would probably be necessary to look at SnoopCompile (or similar options such as --trace-compile) to see where the time is being lost.
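For the --trace-compile route, here is a minimal sketch of how one might collect the list of compiled method signatures so it can be compared between 1.10 and 1.11. The helper name `trace_compiled`, its `flags` keyword, and the toy workload are made up for illustration and are not part of the reproducer; for the real case the workload string would presumably be the `include(...)` of the reproducer file, with `--project` passed through `flags`.

```julia
# Hypothetical helper (not from the reproducer): run `workload` in a fresh
# Julia process with --trace-compile and return the recorded statements.
function trace_compiled(workload::AbstractString; flags = String[])
    tracefile = tempname()
    # Base.julia_cmd() reuses the currently running julia binary, so launching
    # this script with `julia +1.10` or `julia +1.11` traces that version.
    cmd = `$(Base.julia_cmd()) --startup-file=no $flags --trace-compile=$tracefile -e $workload`
    run(cmd)
    # If nothing was traced, the file may never have been created.
    isfile(tracefile) || return String[]
    # Each non-empty line is one `precompile(Tuple{...})` statement.
    return filter(!isempty, readlines(tracefile))
end

# Toy workload standing in for the reproducer: compiles f for Int64 and Float64.
stmts = trace_compiled("f(x) = x + 1; f(1); f(1.0)")
println(length(stmts), " precompile statements recorded")
foreach(println, first(stmts, min(5, length(stmts))))
```

Running the same script under both versions and diffing the two lists should at least show whether 1.11 compiles more signatures, or the same signatures more slowly. It does not give per-method timings; SnoopCompile would be the tool for that.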