lean4 icon indicating copy to clipboard operation
lean4 copied to clipboard

feat: lake: `require` overhaul

Open tydeu opened this issue 7 months ago • 3 comments

This PR contains major changes to require DSL, the Lake manifest format, types of dependencies, and the how packages and targets are named. These changes overlap in significant ways, hence why there were all done as part of one PR. However, due to its complexity, it may be worthwhile to break significant parts of it up into smaller PRs. What follows is a summary of the major components of the current PR.

This PR depends on leanprover/reservoir#18 being merged (and its tests will need to be update once it is).

require Docstrings

Lake now supports docstrings on require commands:

/-- this is a docstring for a require statement -/
require std from ...

Closes #2898.

require ... if ... & Locked Conditional Dependencies

To specify conditional dependencies, the previous approach of, e.g.:

meta if get_config? doc = some "on" then
require «doc-gen4» from git "https://github.com/leanprover/doc-gen4" @ "main"

has been replaced with:

require «doc-gen4» if get_config? doc = some "on"
  from git "https://github.com/leanprover/doc-gen4" @ "main"

This replacement is backwards-compatible. Lake will intelligently expand meta if .. require ... to require ... if .. with a deprecation warning asking the user to update their configuration file accordingly.

The advantage of this change is that require statements are no longer conditionally elaborated. Instead, conditional dependencies are always part of a package's configuration. They just have settings indicating whether they are conditional and whether the dependency is enabled in a given configuration.

Lake makes use of the information to ensure the manifest always contains a complete record of all possible package dependencies locked to a specific version. That is, lake update will materialize and lock all dependencies -- even those not enabled. Other commands (e.g., lake build) will just materialize / use the packages enabled with the set configuration. This helps ensure reproducible builds across different configurations.

GitHub Dependency Source

Packages which are required from GitHub can now state this explicitly via new require syntax and a new source type rather than reusing the standard git source. For example,

require aesop from github "leanprover" / "aesop"

While Lake could deduce this from some URLs, making this explicit avoids heuristic errors and allows Lake to better provide GitHub-specific features. For instance, Lake now respects a LAKE_GITHUB_URL environment variable to point Lake to a different URL (other than https://github.com) from which to clone GitHub packages. This should help enable, for example, GitHub mirrors in restricted internet contexts and compatibility with Enterprise editions of GitHub.

Reservoir Registry Dependency

Adds a new registry package source which fetches package metadata from a API endpoint (e.g., Reservoir's) and clones a Git package using the information provided. To require such a dependency, the new syntax is:

require <scope> / <pkg-name> [@ <rev>] -- e.g., require "leanprover" / "doc-gen4"

Unlike with Git dependencies, Lake can make use of the richer information provided by the registry to determine the default branch of the package. This means for repositories of packages like std which have a default branch that is not master, Lake will now use said default branch (e.g., in std's case, main).

Also, as with GitHub dependencies, Lake supports configuring the registry endpoint via an environment variable: LAKE_RESERVIOR_URL. Thus, any server providing a similar interface to Reservoir can be used as the registry. Further configuration options paralleling those of Cargo's Alternative Registries and Source Replacement will come in future PRs.

Simple Package & Target Names

To implement Reservoir dependencies, an important detail was to have the package names specified in the require statement match those of the package being required. Previously, a mismatch was just warning, but experience has shown this should be a error (as things break otherwise). Reservoir gets its package names from the lake-manifest.json of the indexed package, thus it is important the name listed there matches and is easily machine-readable.

Before this PR, this name was naively produced via a toJSON on the package's name : Name field. However, this can produce names with escapes that are hard for non-Lean machines to read. These escapes can be removed via setting escape := false in Name.toString, but this makes the name not round-trip if it contains multiple components. To avoid this, Packages are now forced to have simple names (i.e., a Name with a single string component). For consistency's sake, this PR also changes targets (e.g., lean_lib, lean_exe, and target) to have simple names as well.

It is important note that this is a major API refactor with potential for breakage in complex configuration files. I have tried to minimized the places were breakage could occur, but they still exist. For instance, targets identified to by a multipart name literal will no longer exist and produce an error. Unfortunately, the simple FFI example in Lake used such a target:

extern_lib libleanffi pkg := do
  let name := nameToStaticLib "leanffi"
  let ffiO ←pkg.target ``ffi.o
  buildStaticLib (pkg.nativeLibDir / name) #[ffiO]

It needed to be changed to:

extern_lib libleanffi pkg := do
  let name := nameToStaticLib "leanffi"
  let ffiO ← ffi.o.fetch
  buildStaticLib (pkg.nativeLibDir / name) #[ffiO]

There are several such cases in the larger Lean ecosystem as well (e.g., loogle). This is likely because they followed the naming pattern of the example.

tydeu avatar Jan 13 '24 01:01 tydeu