lean4lean icon indicating copy to clipboard operation
lean4lean copied to clipboard

opt: fvar reuse optimization

Open rish987 opened this issue 1 year ago • 0 comments

This PR adds an optimization to the typechecker implementation to make better use of the EquivManager definitional equality cache by reusing free variables where possible.

As a motivating example, suppose we have defeq terms T and S containing free variable x, and we call isDefEq on fun (a : A) (b : B) => T[x := a] and fun (a : A) (b : B) => S[x := a]. This will cache the equality between T[x := f1] and S[x := f2], where f1 and f2 are new free variables generated upon entering the lambda binders for a and b.

However, if we later call isDefEq on fun (b : B) (a : A) => T[x := a] and fun (b : B) (a : A) => S[x := a], we will check the cache for T[x := f3] and S[x := f4], using the newly generated fvars f3 and f4, resulting in a miss. Ideally, we would want to have reused f1 and f2 here so that we could take advantage of the cached knowledge that T[x := f1] and S[x := f2] are defeq.

When entering into a forall or lambda binder of domain type D, it should be sound to reuse an fvar that was previously generated for D in a different context (as long as D has not already been bound in the current context). Any fvars appearing in D would also have to be reused fvars that were generated for previously seen domain types, and so on for the fvars in their types.

I've implemented this by replacing the NameGenerator in the typechecker state with a simple counter, and using a hashmap from domain type expressions to previously generated fvar names that is checked whenever a new fvar name needs to be generated upon entering a lambda or forall binder. With this optimization, Mathlib typechecks about 20% faster when Lean4Lean is run sequentially.

rish987 avatar Nov 05 '24 15:11 rish987