lean4 icon indicating copy to clipboard operation
lean4 copied to clipboard

fix: remove special handling of numerals in `DiscrTree`

Open timotree3 opened this issue 1 year ago • 10 comments

Fixes https://github.com/leanprover/lean4/issues/2867 by removing special handling of OfNat.ofNat numerals from DiscrTree.

The special handling was added in 2b8e55c2f19e6599128bc7e57cb7221df5693b4e to fix a bug represented in the test tests/lean/run/bugNatLitDiscrTree.lean, but with https://github.com/leanprover/lean4/issues/2916 fixed, that test case passes without the special handling.

Some special handling of Nat.zero and Nat.succ remains to ensure that keys continue to match when unification replaces constants of the form Nat.zero...succ.succ with numerals.

One notable implication of the removal of this special handling is that it means that instances of OfNat MyType n for specific natural numbers n should be written as OfNat MyType (nat_lit n). See tests/run/rewrites.lean for an example of this modification being needed.

Since this PR changes the persisted discrimination tree format, an update stage0 commit is necessary to make tests pass. (Otherwise oleans produced by the stage0 compiler with old discrimination tree data are used alongside the new discrimination tree algorithm.)

timotree3 avatar Mar 14 '24 17:03 timotree3

Mathlib CI status (docs):

  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase e61d082a9555464bca6de954f8fbc2a1b1a1bf89 --onto 317adf42e92a4dbe07295bc1f0429b61bb8079fe. (2024-03-14 19:58:14)
  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 4e3a8468c351531235ba5cffd5566b7283b7a8ca --onto 317adf42e92a4dbe07295bc1f0429b61bb8079fe. (2024-03-14 20:37:41)

Blocked by #3686 which would fix the only remaining failing test.

timotree3 avatar Mar 14 '24 21:03 timotree3

awaiting-review

timotree3 avatar Apr 05 '24 04:04 timotree3

@leodemoura I would love if you would review this PR when you get the chance. I think you are effectively the code owner for the discrimination tree.

timotree3 avatar Apr 08 '24 20:04 timotree3

One notable implication of the removal of this special handling is that it means that instances of OfNat MyType n for specific natural numbers n should be written as OfNat MyType (nat_lit n). See tests/run/rewrites.lean for an example of this modification being needed.

This seems like a significant usability regression from my perspective. If failing to decorate a term in one spot with a special modifier leads to various tools degrading in their performance or not working well, but doesn't throw an error, then I suspect we'll mostly end up with various tools degrading in their performance.

Is there a way to fix this elsewhere?

david-christiansen avatar Apr 09 '24 03:04 david-christiansen

This seems like a significant usability regression from my perspective.

You're right. Getting an error message failed to synthesize instance OfNat MyType 0 when you have an instance of OfNat MyType 0 (but not OfNat MyType (nat_lit 0)) would be a very confusing user experience. If we did think it was preferable from a language design point of view for the two spellings of this instance to have differing precise semantics, then we should at least improve this error message. One easy way would be to set pp.natLit true by default so that the error message is failed to synthesize instance OfNat MyType (nat_lit 0) instead.

timotree3 avatar Apr 09 '24 18:04 timotree3

Upon further reflection, I am realizing that the following might be a helpful guiding principle for discrimination tree behavior:

If e1 and e2 can be unified at reducible transparency, then queries at key e2 should return values inserted at key e1 in a discrimination tree.

Given this specification, let's evaluate the status quo and this PR on three key examples:

/-! For each reducible defeq, test that instances for the LHS apply to the RHS -/

class MyClass (n : Nat) : Prop where

def myClass_zero : MyClass 0 where
def myClass_ofNat (n : Nat) : MyClass (OfNat.ofNat n) where

#check show 0 = nat_lit 0 by with_reducible rfl -- OK (reducibly defeq)

attribute [local instance] myClass_zero in
#synth MyClass (nat_lit 0)
-- Status quo: OK (instance applied)
-- This PR: ERROR (instance didn't apply)

#check show OfNat.ofNat _ = 0 by with_reducible rfl -- OK (reducibly defeq)

attribute [local instance] myClass_ofNat in
#synth MyClass 0
-- Status quo: ERROR (instance didn't apply)
-- This PR: OK (instance applied)

#check show OfNat.ofNat _ = nat_lit 0 by with_reducible rfl -- ERROR (not reducibly defeq)

attribute [local instance] myClass_ofNat in
#synth MyClass (nat_lit 0)
-- Status quo: ERROR (instance didn't apply)
-- This PR: ERROR (instance didn't apply)

As you can see, the key change with this PR is that the discrimination tree now fails to honor the reducible defeq 0 = nat_lit 0, but successfully honors the reducible defeq OfNat.ofNat _ = 0.

I've included the third example to highlight that the current behavior of reducible defeq is actually non-transitive: We have OfNat.ofNat _ = 0 and 0 = nat_lit 0, but not OfNat.ofNat _ = nat_lit 0. This non-transitivity is rather directly the cause of the existing bug: The discrimination tree eagerly normalizes by the equality 0 = nat_lit 0, and then fails to unify OfNat.ofNat _ with nat_lit 0. Fixing the behavior of reducible defeq will naturally determine the correct fix for the discrimination tree.

I think there are two reasonable approaches to fixing this:

  • Demote OfNat.ofNat to a regular old semireducible function.
    • This means embracing the distinction between 0 (AKA OfNat.ofNat (nat_lit 0)) and nat_lit 0 and removing the reducible defeq 0 = nat_lit 0.
  • Embrace OfNat.ofNat as a reducible function and make @OfNat.ofNat Nat x (instOfNatNat _) = x a reducible defeq.

This PR would move the discrimination tree a step towards the former approach of demoting OfNat.ofNat to a normal function, although it does not finish the job by actually updating the behavior of reducible defeq.

Now that we're discussing it though, I think the latter approach might have some serious benefits. Namely, it would mean that you (almost?) never have to mentally distinguish between nat_lit n and n as a user, effectively demoting nat_lit to an implementation detail rather than a language feature.

What do you think @david-christiansen?

timotree3 avatar Apr 09 '24 20:04 timotree3

I don't feel like I have enough context to evaluate the technical trade-offs of the concrete proposals - sorry!

But I do know that explaining where and why nat_lit was required was extremely difficult while working on Functional Programming in Lean, and this actually led to it not being mandatory when writing instances. I would really like to maintain the property that users don't have to think about it, treating it (as you said) as an internal implementation detail.

david-christiansen avatar Apr 11 '24 16:04 david-christiansen

(and thank you for the thorough consideration!)

david-christiansen avatar Apr 11 '24 16:04 david-christiansen

Without implying that it is sufficient, a Mathlib adaptation branch would be necessary for progress here.

kim-em avatar Jul 17 '24 02:07 kim-em