lean4 icon indicating copy to clipboard operation
lean4 copied to clipboard

Add a safe wrapper around `ptrEq`

Open gebner opened this issue 3 years ago • 3 comments

We want to use pointer equality as a performance optimization. For example, as a first check in DecidableEq instances: if the two arguments are pointer-equal, then we can immediately return isTrue. However ptrEq is unsafe, and every time it is used via implementedBy we potentially introduce bugs. Therefore we want a safe wrapper around ptrEq. There are several issues to consider:

  1. ptrEq is not compatible with equality. ptrEq a a will always return true, but ptrEq a b may return false even if a = b. This is easy to address by either using a CPS-style definition or wrapping the result in a quotient like safePtrEq (a b : α) : Squash Bool.
  2. We need some information about the result of ptrEq beyond the fact that it is a Boolean, so that we can conclude a = b if ptrEq a b returns true. This is also straightforward to encode, e.g. with safePtrEq (a b : α) : Squash { b : Bool // b → a = b }.
  3. In general, ptrEq a b does not imply a = b.

Number (3) causes the withPtrEq function in core to be unsafe:

withPtrEq {α : Type u} (a b : α) (k : Unit → Bool) (h : a = b → k () = true) : Bool

The problem is that it is very easy to construct provably different Lean terms that evaluate to identical VM objects. In hindsight, this is an obvious effect of erasure and common subexpression elimination. The following are some concrete examples:

@[simp] theorem Bool_ne_Empty : Bool ≠ Empty :=
  (Empty.rec <| · ▸ true)

-- 1) Types are erased, so ptrEq will always claim that they are equal:
def problem1 := withPtrEq Bool Empty (fun _ => false) (by simp)
#reduce problem1 -- false
#eval problem1 -- true

-- 2) This works even if we're not directly comparing types, but also when
-- they're nested in a heap value (due to CSE).
def problem2 := withPtrEq (Bool, 1) (Empty, 1) (fun _ => false) (by simp)
#reduce problem2 -- false
#eval problem2 -- true

Potential fixes:

  1. Cripple the compiler to never do CSE after erasure, and have withPtrEq return false if one of the arguments is box(0). (Only mentioned for completeness, I don't think this is a good idea.)
  2. Add a type class class InjectiveVMRepr (α : Sort u) : Prop where private mk ::, auto-generate instances for inductives and structures that are safe to compare with ptrEq, and add an additional [InjectiveVMRepr α] argument to withPtrEq. Unfortunately it is not possible to use unsafe for such Rust-style "marker traits", because we need to prove Nonempty (InjectiveVMRepr α) to define the safe instances.
  3. Add a type class SafePtrEq with a safePtrEq (a b : α) : Squash { b : Bool // b → a = b } operation. This has the downside of increasing the compiled code size (due to monomorphization), and we can't define instances like [SafePtrEq α] → SafePtrEq (Array α).

gebner avatar Aug 22 '22 19:08 gebner

Here is another potential fix:

  • Do not inline ptrEq at the initial compilation phases. Remark: in the new compiler stack, we are adding support for controlling in which phase a definition is inlined.
  • During code specialization, monomorphise ptrEq.
  • When we introduce impure features (e.g., inc, dec, support for IOBase), we also process ptrEq. In the new compiler stack, we are preserving type information, the information is sufficient for deciding whether the runtime representation is injective. If it is not ptrEq a b is replaced with false. Otherwise, we inline it.

leodemoura avatar Aug 27 '22 17:08 leodemoura

During code specialization, monomorphise ptrEq.

I like this approach. It also helps with decidability instances for polymorphic types:

instance [DecidableEq α] : DecidableEq (List α) := ...

With the TC approach, we'd need an extra [SafePtrEq α] argument here so that the instance can be optimized when α has an injective VM representation.

The only downside is that monomorphization is limited to type-class arguments at the moment. So a function like def foo (a b : α) := ptrEq a b would not be monomorphized (even though that would be necessary to take advantage of the optimization).

gebner avatar Aug 29 '22 11:08 gebner

The only downside is that monomorphization is limited to type-class arguments at the moment. So a function like def foo (a b : α) := ptrEq a b would not be monomorphized (even though that would be necessary to take advantage of the optimization).

This is true, but we are currently working on the new code generator, and we can add support for this feature.

leodemoura avatar Aug 29 '22 14:08 leodemoura