julia Allow inlining methods with unmatched type parameters

Rebase of #44656 to solve some inference regressions in #44803

As Keno predicted, there are some regressions in this PR as well, including quite a few test suite failures that I will be fixing up.

Additionally, some needed future work would be to teach SROA how to fold svec_ref more generally, as it is currently very specific

The original description follows below.

Currently we do not allow inlining any methods that have unmatched type parameters. The original reason for this restriction is that I didn't really know what to put for an inlined :static_parameter, so I just had inlining bail. As a result, in code like:

f(x) = Val{x}()

the call to Val{x}() would not be inlined unless x was known through constant propagation.

This PR attempts to remidy that. A new builtin is added that computes the static parameters for a given method/argument list. Additionally, sroa gains the ability to simplify and fold this builtin. As a result, inlining can insert an expression that computes the correct values for the inlinees static parameters.

The change benchmarks favorably:

Before:

julia> function foo()
          for i = 1:10000
              Base.donotdelete(Val{i}())
          end
       end
foo (generic function with 1 method)

julia> @time foo()
  0.375567 seconds (4.24 M allocations: 274.440 MiB, 14.67% gc time, 72.96% compilation time)

julia> @time foo()
  0.012387 seconds (9.49 k allocations: 148.266 KiB)

After:

julia> function foo()
          for i = 1:10000
              Base.donotdelete(Val{i}())
          end
       end
foo (generic function with 1 method)

julia> @time foo()
  0.003058 seconds (29.47 k allocations: 1.546 MiB)

julia> @time foo()
  0.001200 seconds (9.49 k allocations: 148.266 KiB)

Note that this particular benchmark could also be fixed by #44654, but this change is more general.

There is a potential downside, which is that we remove a specialization barrier here. We already do that in the case when all type parameters are matched, so it's not eggregious. However, there is anectodal knowledge in the community that extra type parameters force specialization. Some of that is due to the handling of type parameters in the specialization code, but some of it might also be due to inlining's prior refusal to perform this inlining. We'll have to keep an eye out for any regressions.

Apr 22 '22 22:04 ianatol

Re: partial reverting of #44512:

choltype from LinearAlgebra was causing BoundsErrors during test suite runs.

Looking further, it appeared that in the following IR,

Expr(:call, Main.eltype, Core.Argument(n=2)),
  Expr(:call, Main.oneunit, SSAValue(1)),
  Expr(:call, Main.sqrt, SSAValue(2)),
  Expr(:call, Main.typeof, SSAValue(3)),
  Expr(:call, Main.promote_type, SSAValue(4), Main.Float32),
  Core.ReturnNode(val=SSAValue(5))]

we were choosing to inline promote_type with the following match: promote_type(DataType, Type{Float32}) from promote_type(Type{T}, Type{T}) where {T}.

This triggered _compute_sparams where m = promote_type(Type{T}, Type{T}) where {T} tt = Tuple{typeof(Base.promote_type), Type{Float64}, Type{Float32}}, which resulted in env = svec(), and our BoundsError.

Keno helped me to figure out that this was due to inlining not being able to discern which signature was right due to the spec_types of promote_type(Type{T}, Type{T}) where {T} and promote_type(Type{T}, Type{S}) where {T,S} being identical in our case. However, DataType in promote_type(DataType, Type{Float32}) would later become Float64, meaning that this selection was actually incorrect.

By bringing back this branch that was removed in #44512, we were able to be a bit more fine-grained about when we could bypass validating sparams (as well as choosing not to inline when we had a direct match of sig and spec_types).

Apr 27 '22 02:04 ianatol

@nanosoldier runtests(ALL, vs = ":master")

Apr 27 '22 20:04 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Apr 28 '22 03:04 nanosoldier

@nanosoldier runtests(["AlgebraicPetri", "AlphaStableDistributions", "Earth2014", "Evolutionary", "FameSVD", "FlashWeave", "FractionalSystems", "GeoDatasets", "GeometricFlux", "Glimmer", "GraphSignals", "GridapEmbedded", "GridapGmsh", "Hecke", "Infernal", "InformationGeometry", "IntensityScans", "JSONLines", "JetPack", "Kahuna", "LoggingExtras", "MatrixMarket", "Metida", "MultiScaleTreeGraph", "NeuralArithmetic", "NeuralOperators", "ODEInterface", "ODEInterfaceDiffEq", "OptimKit", "OptimizationAlgorithms", "Oracle", "Org", "PDENLPModels", "ParallelAnalysis", "PhaseSpaceTools", "Plasma", "PoreMatMod", "Quadrature", "Relief", "StochasticRounding", "VIDA", "ValueShapes", "YAAD"], vs = ":master")

Apr 28 '22 16:04 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Apr 28 '22 17:04 nanosoldier

GeometricFlux, GraphSignals, GridapEmbedded, GridapGmsh, JetPack, NeuralOperators, Org, PDENLPModels, and ValueShapes are all caused by the same issue with FillArrays where we end up with compute_sparams on

Tuple{Type{FillArrays.Fill{T, N, Axes} where Axes where N where T}, T, Vararg{Integer, N}} where N where T and Tuple{Type{FillArrays.Fill{T, N, Axes} where Axes where N where T}, Float32, Tuple{Int64}}

Seems similar to the case listed above, but still investigating where a possibly incorrect inlining choice happens

Apr 28 '22 22:04 ianatol

@nanosoldier runtests(["Evolutionary", "GeometricFlux", "GraphSignals", "GridapEmbedded", "GridapGmsh", "Hecke", "InformationGeometry", "JetPack", "Metida", "NeuralOperators", "ODEInterface", "ODEInterfaceDiffEq", "Org", "PDENLPModels", "ValueShapes"], vs = ":master")

May 02 '22 21:05 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

May 02 '22 23:05 nanosoldier

@nanosoldier runtests(["Hecke", "Metida", "ODEInterface", "Org"], vs = ":master")

May 05 '22 00:05 ianatol

@aviatesk Could I ask for your review here? Specifically on 3a6b36c where I revert some of #44512, as well as on 884e5ea, where I somewhat broadly restrict the cases in which we can perform this inlining. It feels like we could do better, but I'm not familiar enough with this part of the codebase to fully understand

May 05 '22 00:05 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

May 05 '22 01:05 nanosoldier

Re: test failures in Metida and ODEInterface:

The ccall and cfunctions showing up in the test aren't having their types properly calculated during inlining since we are entering ssa_substitute_op! while spvals::SSAValue.

My idea is that we can do something similar to svec_ref, where we insert a node (in this case a call to jl_instantiate_type_in_env) where spvals is still an SSAValue. Then, before emitting the substituted ccall, we replace spvals in our inserted nodes with our calculation of the intersection.

May 05 '22 22:05 ianatol

@aviatesk Could I ask for your review here? Specifically on 3a6b36c where I revert some of #44512, as well as on 884e5ea, where I somewhat broadly restrict the cases in which we can perform this inlining. It feels like we could do better, but I'm not familiar enough with this part of the codebase to fully understand

Sure, I will look into this later today or the weekend.

May 06 '22 05:05 aviatesk

@nanosoldier runbenchmarks(!"scalar", vs=":master")

May 06 '22 09:05 aviatesk

By bringing back this branch that was removed in #44512, we were able to be a bit more fine-grained about when we could bypass validating sparams (as well as choosing not to inline when we had a direct match of sig and spec_types).

#44512 removed the previous (somewhat random imho) special casings, and you can recover some if needed. But I'd like to revert it in a way at least we understand which optimizable cases handled by #44512 are now ignored, and which are still optimized. Rather as far as I understand I believe what essentially needed is the only_method information specifically, and it can co-exist with the code after #44512?

May 06 '22 09:05 aviatesk

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

May 06 '22 15:05 nanosoldier

Seems to be passing CI as well as locally passing previously problematic package tests (Metida, Hecke, etc).

I'd still like to do better on 4934be's implementation, as well as add some more tests, but this could use PkgEval again as there have been a few changes since the last one.

@nanosoldier runtests(ALL, vs = ":master")

Jul 13 '22 23:07 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Jul 14 '22 09:07 nanosoldier

@nanosoldier runtests(["Alicorn", "AlphaStableDistributions", "AprilTags", "AstroChemistry", "DistributedArrays", "EBayes", "EnsembleKalmanProcesses", "Evolutionary", "GadgetIO", "GasChromatographySimulator", "Hashpipe", "HetaSimulator", "ITensorTDVP", "JumpProcesses", "LibExpat", "ManifoldDiffEq", "NumericalAlgorithms", "OpenTelemetryProto", "OptimKit", "OptimizationMetaheuristics", "Oscar", "Reproject", "Retriever", "SimplexGridFactory", "SymbolicUtils"], vs = ":master")

Jul 14 '22 15:07 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Jul 15 '22 03:07 nanosoldier

Still need to add more tests, but could I get another review @aviatesk?

Aug 01 '22 21:08 ianatol

Haven't been able to recreate the SparseArrays/umfpack error locally, so I guess it may be unrelated? Either way, this should be ready for another review when you get a chance! @aviatesk

Aug 09 '22 16:08 ianatol

What's the status of this regression?

As far as I can tell, there is no longer a regression. I will try another PkgEval today, though the logic hasn't changed too much since the last one that came back clean (and I've been checking previously problematic packages locally). The regression Keno was talking about mostly manifested in some test suite failures, which are all resolved now.

Aug 10 '22 14:08 ianatol

@nanosoldier runtests(ALL, vs = ":master")

Aug 19 '22 21:08 ianatol

@maleadt is PkgEval running on this? It didn't show up in the status list.

Aug 20 '22 01:08 Keno

@nanosoldier runtests(ALL, vs = ":master")

Aug 20 '22 06:08 ianatol

Maybe because your membership isn't public? Let's see if I trigger

@nanosoldier runtests(ALL, vs = ":master")

I just annoyingly lost sudo access to the PkgEval machine, so can't debug this right now.

Aug 20 '22 07:08 maleadt

My membership to JuliaLang and JuliaComputing are both public (though that definitely caught me in the past lol). It still doesn't seem to be running, so I guess there's some other issue?

Aug 20 '22 16:08 ianatol

It still doesn't seem to be running, so I guess there's some other issue?

Yeah, the machine PkgEval is running on lost its DNS configuration. I think that killed Kerberos, locking me mostly out. So this may take a couple of days.

Aug 20 '22 20:08 maleadt

Should be working again:

@nanosoldier runtests(ALL, vs = ":master")

Aug 22 '22 12:08 maleadt

Sorry, restarting the job so that it uses a newer rootfs to build Julia (fixing certain package issues):

@nanosoldier runtests(ALL, vs = ":master")

Aug 22 '22 13:08 maleadt

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Aug 22 '22 22:08 nanosoldier

Hmm, seems we aren't detecting foreign calls to disable this kind of inlining correctly. These other failures I'm not as sure about. Hopefully they go away on a retry:

@nanosoldier runtests(["AbstractAlgebra", "BenchmarkFunctions", "BenchmarkProfiles", "Binscatters", "CairoMakie", "CalibrationAnalysis", "Compat", "CopEnt", "CountdownNumbers", "Enigma", "Evolutionary", "FMI", "FlameGraphs", "Folds", "ForwardDiff", "FunctionOperators", "ImageGeoms", "InformationInequalities", "KitePodModels", "MatrixProfile", "Mazes", "MultiplesOfPi", "NODAL", "NeuralDynamics", "ODEInterface", "ODEInterfaceDiffEq", "OptimKit", "Org", "PGENFiles", "ParameterEstimateScatterPlots", "Pidfile", "PlantGeom", "PlatformAware", "ProxSDP", "QuadEig", "QuadraticToBinary", "QuantumTomography", "ReinforcementLearningExperiments", "RetroCap", "RigidBodyTools", "RootedTrees", "RoundAndSwap", "SDMX", "SignalTablesInterface_WGLMakie", "SolverBenchmark", "SpatialEcology", "SteadyStateDiffEq", "StochasticProcesses", "StrBase", "TensorBoardLogger", "TensorGames", "Transits"], vs = ":master")

Aug 22 '22 22:08 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Aug 23 '22 10:08 nanosoldier

@nanosoldier runtests(["CairoMakie", "Compat", "CountdownNumbers", "FlameGraphs", "Folds", "MultiplesOfPi", "ODEInterface", "ODEInterfaceDiffEq", "Org", "PGENFiles", "Pidfile", "PlatformAware", "QuantumTomography", "ReinforcementLearningExperiments", "RetroCap", "RoundAndSwap", "SteadyStateDiffEq", "StrBase", "TensorBoardLogger", "Transits"], vs = ":master")

Aug 24 '22 18:08 ianatol

Your package evaluation job has completed - possible new issues were detected. A full report can be found here.

Aug 24 '22 23:08 nanosoldier

Of the few remaining PkgEval errors, I am unable to recreate them on multiple machines locally and they seem fairly unrelated. I think this is good to go.

Aug 25 '22 00:08 ianatol

@nanosoldier runbenchmarks("inference", vs=":master")

Aug 25 '22 04:08 aviatesk

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

Aug 25 '22 04:08 nanosoldier

Congrats, @ianatol !

Aug 26 '22 00:08 aviatesk

julia julia copied to clipboard

Allow inlining methods with unmatched type parameters

julia
julia copied to clipboard