lean4 Proposal for Stdlib Contributions [RFC]

The current standard library in the Lean core is lacking many common utility functions. One way to go about fleshing it out is to allow the community to contributed useful utilities. However, such contributions would likely be small and thus not really suited for the current multi-step RFC contributing process. Instead, it would be nice to be able to fast track such small changes. To help make the process less unwieldy, we can provide examples of which contributions would be welcome and which would not. For example, we probably would want functions like List.nth, but may wish to relegate syntactic extensions to the formal RFC process.

To give an idea of what we want to evolve Lean's standard library into, we can take cues from, for instance, F# and Haskell:

Note: For the Haskell references, it should be observed that Haskell is (usually) lazy and Lean is strict, so some utilities will not translate.

Jun 13 '21 08:06 tydeu

@leodemoura, @Kha please tag this as help wanted, if you're happy with it, so we can invite outside contributions on the lib.

I've helped with the RFC, am supportive of it, and it seems to align well with various internal conversations we had.

Jun 13 '21 08:06 DanielFabian

This is great, but we need to prepare for it. @Kha and I have been discussing a package reorganization. We wait to avoid unnecessary dependencies when building Lean. One of the ideas we have discussed is to split Std into StdCore that is required to build Lean, ant Std which would include the suggestions you have above.

@kha: How will the module system you are designing impact this RFC?

@tydeu: You floated the idea of putting most types and definitions at Init in a namespace. We should decide whether we will make this change or not before we ask people to contribute code to the Std.

We should also add guidelines for theorems. I am assuming many will want to contribute theorems about the functions and types being defined at Std.

Jun 13 '21 21:06 leodemoura

@leodemoura, @Kha please tag this as help wanted, if you're happy with it, so we can invite outside contributions on the lib.

@DanielFabian I will add the tag as soon as we figure out the details described in my previous message.

Jun 13 '21 21:06 leodemoura

@tydeu: You floated the idea of putting most types and definitions at Init in a namespace. We should decide whether we will make this change or not before we ask people to contribute code to the Std.

Since @leodemoura mentioned me, I will reaffirm that I do still very strongly support this idea. I am also willingly to make the changes necessary to do so should that be desired (though I can see that such a change might be too sensitive for someone less experienced, like myself, to make).

Jun 14 '21 04:06 tydeu

We should also add guidelines for theorems. I am assuming many will want to contribute theorems about the functions and types being defined at Std.

This raises an interesting question. Do we want complex (non-rfl) theorems to be collocated with the functions themselves? Or should proofs be in a separate package? I imagine that many of the more complex theorems will require some heavy duty math and thus it might be more appropriate to relegate them to mathlib rather than include them in the stdlib.

Jun 14 '21 04:06 tydeu

@Kha: How will the module system you are designing impact this RFC?

I'm not sure it should impact it at all. Certainly partitioning modules into packages should be mostly independent of it, that is a different organization layer. Where to put proofs as well, as long as we keep a non-signature import around (which I'm relatively sure we'll want to), moving them into a separate module or even package should not be any harder than it is right now.

If we really want to fine-tune separate compilation of Lean itself, we may want a smaller prelude (i.e. default import). But as Gabriel said, separate compilation of the core lib doesn't really impact users since any change in the core will most likely change the Lean binary as well, so external Lean packages should be recompiled completely either way. So it might make more sense to keep the prelude roughly as it is right now and make use of explicit preludes even in say Lean/ where profitable, in which case we can make the dependencies as fine- or coarse-grained as we like.

Jun 14 '21 09:06 Kha

@Kha and I have been discussing a package reorganization. We wait to avoid unnecessary dependencies when building Lean. One of the ideas we have discussed is to split Std into StdCore that is required to build Lean, ant Std which would include the suggestions you have above.

@Kha What about this one? If I remember correctly, the split was motivated by two things

Lean compilation time. The build time would now depend only on StdCore
Bootstrapping issues when modifying Lean. We are still polishing Lean and making small modifications to the language and tactics. Some of these modifications require many steps and update-stage0 operations. Right now, it is not too hard to change a tactic since they are not used to build Lean itself.

Jun 14 '21 15:06 leodemoura

this feels like fairly orthogonal. How are tactics related to List.nth or bimap? Those kinds of trivial functions that every functional language wants to have really, really don't want to use tactics, but rather be carefully written tail-recursive functions that get compiled to simple loops.

I for one am not even convinced it's desirable to have proofs sit next to the basic functions. It might well make sense to have proofs about them sit in a different namespace or package.

Also, how could a bimap or List.nth take any time at all to compile? I'd expect the whole library of F# and Haskell modules outlined would be 2k lines tops and take about 2 seconds to compile altogether.

Jun 14 '21 16:06 DanielFabian

this feels like fairly orthogonal. How are tactics related to List.nth or bimap? Those kinds of trivial functions that every functional language wants to have really, really don't want to use tactics, but rather be carefully written tail-recursive functions that get compiled to simple loops.

It is natural to have theorems for functions, and theorems use tactics. Moreover, the tactic was just an example. When we improved match, the code base had to be adjusted. Some of the changes we have to do in Lean are super painful, and the pain is proportional to the amount of code needed to build Lean. We must not have unnecessary dependencies.

I for one am not even convinced it's desirable to have proofs sit next to the basic functions. It might well make sense to have proofs about them sit in a different namespace or package.

We can have them in different files, but it feels too drastic to use a different package.

Jun 14 '21 16:06 leodemoura

I get the argument, but only half way. By having a richer library, you can express some idioms more concisely using a helper and if the code needs to change, it may well only have to change in the implementation of the helper as opposed to each call-site.

A classic example is a fold. When you have a fold, you should not need to write so many recursive functions, thus fewer matches, etc.

If a helper is not used at all in the code base, then the argument holds of course.

Jun 14 '21 16:06 DanielFabian

@DanielFabian We should continue the discussion in a face-to-face meeting. I am not managing to convey how painful the bootstrapping issues are here, and why we must minimize the number of dependencies we need to build the Lean binary. It is taking to much time to make the point here in asynchronous media.

Jun 14 '21 16:06 leodemoura

I would recommend you spend some time "designing" your "namespaces" first, then this provides a framework for stuff to be filled in. Then the "standard library" needs some really clear "design principles" to help decide what goes in and what doesn't. Then this all needs to be posted someplace so we can all see it.

Jan 20 '22 19:01 lovettchris

lean4 lean4 copied to clipboard

Proposal for Stdlib Contributions [RFC]

lean4
lean4 copied to clipboard