lean4
lean4 copied to clipboard
Proposal for Stdlib Contributions [RFC]
The current standard library in the Lean core is lacking many common utility functions. One way to go about fleshing it out is to allow the community to contributed useful utilities. However, such contributions would likely be small and thus not really suited for the current multi-step RFC contributing process. Instead, it would be nice to be able to fast track such small changes. To help make the process less unwieldy, we can provide examples of which contributions would be welcome and which would not. For example, we probably would want functions like List.nth
, but may wish to relegate syntactic extensions to the formal RFC process.
To give an idea of what we want to evolve Lean's standard library into, we can take cues from, for instance, F# and Haskell:
- [ ] F#'s Option
- [ ] Haskell's Maybe
- [ ] F#'s String
- [ ] Haskell's String
- [ ] F#'s List
- [ ] Haskell's List
- [ ] F#'s Map
- [ ] Haskel's Map
- [ ] F#'s Set
- [ ] Haskell's Set
- [ ] F#'s Array
Note: For the Haskell references, it should be observed that Haskell is (usually) lazy and Lean is strict, so some utilities will not translate.
@leodemoura, @Kha please tag this as help wanted
, if you're happy with it, so we can invite outside contributions on the lib.
I've helped with the RFC, am supportive of it, and it seems to align well with various internal conversations we had.
This is great, but we need to prepare for it. @Kha and I have been discussing a package reorganization.
We wait to avoid unnecessary dependencies when building Lean. One of the ideas we have discussed is to split Std
into StdCore
that is required to build Lean, ant Std
which would include the suggestions you have above.
@kha: How will the module system you are designing impact this RFC?
@tydeu: You floated the idea of putting most types and definitions at Init
in a namespace. We should decide whether we will make this change or not before we ask people to contribute code to the Std.
We should also add guidelines for theorems. I am assuming many will want to contribute theorems about the functions and types being defined at Std
.
@leodemoura, @Kha please tag this as help wanted, if you're happy with it, so we can invite outside contributions on the lib.
@DanielFabian I will add the tag as soon as we figure out the details described in my previous message.
@tydeu: You floated the idea of putting most types and definitions at
Init
in a namespace. We should decide whether we will make this change or not before we ask people to contribute code to the Std.
Since @leodemoura mentioned me, I will reaffirm that I do still very strongly support this idea. I am also willingly to make the changes necessary to do so should that be desired (though I can see that such a change might be too sensitive for someone less experienced, like myself, to make).
We should also add guidelines for theorems. I am assuming many will want to contribute theorems about the functions and types being defined at
Std
.
This raises an interesting question. Do we want complex (non-rfl
) theorems to be collocated with the functions themselves? Or should proofs be in a separate package? I imagine that many of the more complex theorems will require some heavy duty math and thus it might be more appropriate to relegate them to mathlib rather than include them in the stdlib.
@Kha: How will the module system you are designing impact this RFC?
I'm not sure it should impact it at all. Certainly partitioning modules into packages should be mostly independent of it, that is a different organization layer. Where to put proofs as well, as long as we keep a non-signature import around (which I'm relatively sure we'll want to), moving them into a separate module or even package should not be any harder than it is right now.
If we really want to fine-tune separate compilation of Lean itself, we may want a smaller prelude (i.e. default import). But as Gabriel said, separate compilation of the core lib doesn't really impact users since any change in the core will most likely change the Lean binary as well, so external Lean packages should be recompiled completely either way. So it might make more sense to keep the prelude roughly as it is right now and make use of explicit prelude
s even in say Lean/
where profitable, in which case we can make the dependencies as fine- or coarse-grained as we like.
@Kha and I have been discussing a package reorganization. We wait to avoid unnecessary dependencies when building Lean. One of the ideas we have discussed is to split Std into StdCore that is required to build Lean, ant Std which would include the suggestions you have above.
@Kha What about this one? If I remember correctly, the split was motivated by two things
- Lean compilation time. The build time would now depend only on
StdCore
- Bootstrapping issues when modifying Lean. We are still polishing Lean and making small modifications to the language and tactics. Some of these modifications require many steps and
update-stage0
operations. Right now, it is not too hard to change a tactic since they are not used to build Lean itself.
this feels like fairly orthogonal. How are tactics related to List.nth
or bimap
? Those kinds of trivial functions that every functional language wants to have really, really don't want to use tactics, but rather be carefully written tail-recursive functions that get compiled to simple loops.
I for one am not even convinced it's desirable to have proofs sit next to the basic functions. It might well make sense to have proofs about them sit in a different namespace or package.
Also, how could a bimap
or List.nth
take any time at all to compile? I'd expect the whole library of F# and Haskell modules outlined would be 2k lines tops and take about 2 seconds to compile altogether.
this feels like fairly orthogonal. How are tactics related to List.nth or bimap? Those kinds of trivial functions that every functional language wants to have really, really don't want to use tactics, but rather be carefully written tail-recursive functions that get compiled to simple loops.
It is natural to have theorems for functions, and theorems use tactics. Moreover, the tactic was just an example. When we improved match
, the code base had to be adjusted.
Some of the changes we have to do in Lean are super painful, and the pain is proportional to the amount of code needed to build Lean. We must not have unnecessary dependencies.
I for one am not even convinced it's desirable to have proofs sit next to the basic functions. It might well make sense to have proofs about them sit in a different namespace or package.
We can have them in different files, but it feels too drastic to use a different package.
I get the argument, but only half way. By having a richer library, you can express some idioms more concisely using a helper and if the code needs to change, it may well only have to change in the implementation of the helper as opposed to each call-site.
A classic example is a fold
. When you have a fold, you should not need to write so many recursive functions, thus fewer match
es, etc.
If a helper is not used at all in the code base, then the argument holds of course.
@DanielFabian We should continue the discussion in a face-to-face meeting. I am not managing to convey how painful the bootstrapping issues are here, and why we must minimize the number of dependencies we need to build the Lean binary. It is taking to much time to make the point here in asynchronous media.
I would recommend you spend some time "designing" your "namespaces" first, then this provides a framework for stuff to be filled in. Then the "standard library" needs some really clear "design principles" to help decide what goes in and what doesn't. Then this all needs to be posted someplace so we can all see it.