lean4 icon indicating copy to clipboard operation
lean4 copied to clipboard

feat: reserved names

Open leodemoura opened this issue 11 months ago • 1 comments

  • Add support for reserved declaration names. We use them for theorems generated on demand.
  • Equation theorems are not private declarations anymore.
  • Generate equation theorems on demand when resolving symbols.
  • Prevent users from creating declarations using reserved names. Users can bypass it using meta-programming.

See next test for examples.

leodemoura avatar Mar 14 '24 04:03 leodemoura

Mathlib CI status (docs):

  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 995726f75fd7eed223c2189a54f6df3293185327 --onto 317adf42e92a4dbe07295bc1f0429b61bb8079fe. (2024-03-14 04:38:47)
  • ❗ Std/Mathlib CI will not be attempted unless your PR branches off the nightly-with-mathlib branch. Try git rebase 4e3a8468c351531235ba5cffd5566b7283b7a8ca --onto 317adf42e92a4dbe07295bc1f0429b61bb8079fe. (2024-03-15 00:30:33)

Could the behavior of not being able to declare definitions with names like foo.induct be controllable by an option? Even if it is possible to do with metaprogramming, metaprograms still frequently make use of the user-level def command in order to create declarations. Plus, this is a major breaking change for libraries that happen to have names like this in their public API, and they should have a way to avoid having to break their own users.

digama0 avatar Mar 14 '24 08:03 digama0

If it wouldn't be for the .eq1 theorems, where we don’t know how many we have until we have created them, I’d lean towards having an explicit set of reserved names that is extended at at the time the base name is created. so

  • before def fib, fib.def and fix.induct are plain names
  • with def fib, they are added to the set of reserved names, and an error (or warning) is shown if these names already exist
  • afterwards, using these names runs the lazy action.

This way the user experience is much closer to what it would be if the definitions were simply created eagerly, and it avoids squatting useful names like Foo.induct and Foo.def entirely.

I just don’t have a great idea how to extend that to .eq_1. Maybe this “set of reserved names” needs has to support patterns like .eq_<n> so that all of them can be reserved.

nomeata avatar Mar 14 '24 08:03 nomeata

This is different from adding a ”thunked declaration” to the environment at precisely the time when, from the users point of view, the name did come into existence, in which case there is no squatting in every namespace.

@nomeata I considered this option. It will dramatically increase the number of symbols in the Environment. Startup time will be affected.

leodemoura avatar Mar 14 '24 14:03 leodemoura

But I see the problems: One does not know the number of equation lemmas ahead of times, and maybe it's tricky to serialize a CoreM () thunk, so maybe this isn't an option.

@nomeata Yes, this is another issue with the thunk approach.

leodemoura avatar Mar 14 '24 14:03 leodemoura

@digama0

Could the behavior of not being able to declare definitions with names like foo.induct be controllable by an option?

It is possible as a backward compatibility option.

Plus, this is a major breaking change for libraries that happen to have names like this in their public API, and they should have a way to avoid having to break their own users.

It seems only the def reserved name creates problems in practice. Should we use a different name for it?

leodemoura avatar Mar 14 '24 14:03 leodemoura

before def fib, fib.def and fix.induct are plain names

We can do it.

with def fib, they are added to the set of reserved names, and an error (or warning) is shown if these names already exist

Yes, we can run the extra check at declaration time.

leodemoura avatar Mar 14 '24 14:03 leodemoura

before def fib, fib.def and fix.induct are plain names

We can do it.

Ah, I didn’t quite catch the first time that isReservedName gets to peek at the Environment so foo.def is only reserved if there exists foo. That's better indeed.

If we really want to avoid .def we can use .unfold or .equation (.eq is probably also useful otherwise). Not sure if it’s worth it.

And maybe it’s actually good to squat the .def (or whatever it will be) and .induct namespaces: This way the user’s expectation that foo.def unfolds foo will hold, and .induct is “the” automatically generated induction principle – if an API wants to provide another one, it’s maybe even a good thing that they have to use a different name (foo.strong_induct or whatever), this signals to the user that something special is happening here.

nomeata avatar Mar 14 '24 15:03 nomeata

@nomeata Please take a look at reserved.lean, the last few commits implemented a few improvements.

leodemoura avatar Mar 14 '24 15:03 leodemoura

Thanks, I think that's a nice approach now.

What is your plan for error reporting? Can the delayed action log messages and/or throw exceptions, and will they be passes to the user in a helpful way? Is it the responsibility of the action to prepend the message with something like “Failed to create foo.induct:”, or is that something that'd be handled by the framework here?

nomeata avatar Mar 14 '24 16:03 nomeata

What is your plan for error reporting? Can the delayed action log messages and/or throw exceptions, and will they be passes to the user in a helpful way? Is it the responsibility of the action to prepend the message with something like “Failed to create foo.induct:”, or is that something that'd be handled by the framework here?

@nomeata Right now it is marked as a TODO in the code. I still need to try a few experiments.

leodemoura avatar Mar 14 '24 17:03 leodemoura

I was struggling with this in Aesop, so I love this PR! Once it lands, will there be an expectation that tactics may realise declarations with reserved names, but may not otherwise add new declarations? And will there be a way to efficiently determine which declarations were realised by a tactic?

JLimperg avatar Mar 14 '24 18:03 JLimperg

Once it lands, will there be an expectation that tactics may realise declarations with reserved names, but may not otherwise add new declarations?

It sounds reasonable, but we are not enforcing this expectation.

And will there be a way to efficiently determine which declarations were realised by a tactic?

We are currently not tracking this information, but we could track all names realized so far in an environment extension.

leodemoura avatar Mar 14 '24 19:03 leodemoura

Once it lands, will there be an expectation that tactics may realise declarations with reserved names, but may not otherwise add new declarations?

It sounds reasonable, but we are not enforcing this expectation.

Okay. I'll require that Aesop rules satisfy this expectation and that's going to be fine in practice.

And will there be a way to efficiently determine which declarations were realised by a tactic?

We are currently not tracking this information, but we could track all names realized so far in an environment extension.

This would be useful. To check whether an Aesop rule modified the environment, I currently have to compare the environment's map2's before and after the rule. With the realised name tracking, I could make this check less wasteful and more elegant. Thanks!

JLimperg avatar Mar 15 '24 00:03 JLimperg

@leodemoura

It seems only the def reserved name creates problems in practice. Should we use a different name for it?

Would def_eq be appropriate? Having snake_case names for the reserved identifiers at least ensures they will not overlap with non-theorem user names by convention.

tydeu avatar Mar 16 '24 15:03 tydeu

Having snake_case names for the reserved identifiers at least ensures they will not overlap with non-theorem user names by convention.

That doesn't really help in this case since the theorem in question matches the naming convention (it is a theorem). That's not a bad thing insofar as we want the things lean generates to not be too different, but we also don't want to name-squat things the user wants to use or is already using.

digama0 avatar Mar 17 '24 02:03 digama0

@digama0 Agreed. The idea was that a single world reserved identifier like def (or induct) clash with both potential theorem and non-theorem user names, whereas a multi-world name only clashes with user names of the relevant kind. Thus, when reversing words, using a multi-word name seems preferable as it decreases the chance of a clash.

tydeu avatar Mar 17 '24 02:03 tydeu