lean4
lean4 copied to clipboard
Add a safe wrapper around `ptrEq`
We want to use pointer equality as a performance optimization. For example, as a first check in DecidableEq instances: if the two arguments are pointer-equal, then we can immediately return isTrue. However ptrEq is unsafe, and every time it is used via implementedBy we potentially introduce bugs. Therefore we want a safe wrapper around ptrEq. There are several issues to consider:
ptrEqis not compatible with equality.ptrEq a awill always return true, butptrEq a bmay return false even ifa = b. This is easy to address by either using a CPS-style definition or wrapping the result in a quotient likesafePtrEq (a b : α) : Squash Bool.- We need some information about the result of
ptrEqbeyond the fact that it is a Boolean, so that we can concludea = bifptrEq a breturns true. This is also straightforward to encode, e.g. withsafePtrEq (a b : α) : Squash { b : Bool // b → a = b }. - In general,
ptrEq a bdoes not implya = b.
Number (3) causes the withPtrEq function in core to be unsafe:
withPtrEq {α : Type u} (a b : α) (k : Unit → Bool) (h : a = b → k () = true) : Bool
The problem is that it is very easy to construct provably different Lean terms that evaluate to identical VM objects. In hindsight, this is an obvious effect of erasure and common subexpression elimination. The following are some concrete examples:
@[simp] theorem Bool_ne_Empty : Bool ≠ Empty :=
(Empty.rec <| · ▸ true)
-- 1) Types are erased, so ptrEq will always claim that they are equal:
def problem1 := withPtrEq Bool Empty (fun _ => false) (by simp)
#reduce problem1 -- false
#eval problem1 -- true
-- 2) This works even if we're not directly comparing types, but also when
-- they're nested in a heap value (due to CSE).
def problem2 := withPtrEq (Bool, 1) (Empty, 1) (fun _ => false) (by simp)
#reduce problem2 -- false
#eval problem2 -- true
Potential fixes:
- Cripple the compiler to never do CSE after erasure, and have
withPtrEqreturn false if one of the arguments isbox(0). (Only mentioned for completeness, I don't think this is a good idea.) - Add a type class
class InjectiveVMRepr (α : Sort u) : Prop where private mk ::, auto-generate instances for inductives and structures that are safe to compare withptrEq, and add an additional[InjectiveVMRepr α]argument towithPtrEq. Unfortunately it is not possible to useunsafefor such Rust-style "marker traits", because we need to proveNonempty (InjectiveVMRepr α)to define the safe instances. - Add a type class
SafePtrEqwith asafePtrEq (a b : α) : Squash { b : Bool // b → a = b }operation. This has the downside of increasing the compiled code size (due to monomorphization), and we can't define instances like[SafePtrEq α] → SafePtrEq (Array α).
Here is another potential fix:
- Do not inline
ptrEqat the initial compilation phases. Remark: in the new compiler stack, we are adding support for controlling in which phase a definition is inlined. - During code specialization, monomorphise
ptrEq. - When we introduce impure features (e.g.,
inc,dec, support forIOBase), we also processptrEq. In the new compiler stack, we are preserving type information, the information is sufficient for deciding whether the runtime representation is injective. If it is notptrEq a bis replaced withfalse. Otherwise, we inline it.
During code specialization, monomorphise
ptrEq.
I like this approach. It also helps with decidability instances for polymorphic types:
instance [DecidableEq α] : DecidableEq (List α) := ...
With the TC approach, we'd need an extra [SafePtrEq α] argument here so that the instance can be optimized when α has an injective VM representation.
The only downside is that monomorphization is limited to type-class arguments at the moment. So a function like def foo (a b : α) := ptrEq a b would not be monomorphized (even though that would be necessary to take advantage of the optimization).
The only downside is that monomorphization is limited to type-class arguments at the moment. So a function like def foo (a b : α) := ptrEq a b would not be monomorphized (even though that would be necessary to take advantage of the optimization).
This is true, but we are currently working on the new code generator, and we can add support for this feature.