Automatic Package Repositories from Git

Open rgrinberg opened this issue 1 year ago • 2 comments

I got this idea while thinking about a comment from @yawaramin: https://github.com/ocaml/dune/issues/7680#issuecomment-1909013807.

Motivation

The idea is to introduce a lightweight construct that can replace managing an opam repository for users who are married to git. It isn't a replacement for opam repositories, pins, or any similar construct. It's just another tool for teams that are too small or too distributed to manage their own opam repository.

Idea

We can discover opam packages from nothing more than the URL of a git repo, using the following steps:

  1. Clone the repo
  2. Iterate over all the tags in the repo
  3. For every tag, enumerate all the packages in the repo

With the steps above, one can construct a mini opam repository for all the packages inside that git repo. Given that we are already dealing with opam repos from git, doing the above should be rather simple.
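For illustration only, here is a rough Python sketch of steps 2 and 3, run against a repo that has already been cloned. It is not how dune would implement this; it only shows the shape of the per-tag scan. The helper names (`git`, `opam_packages`, `discover`) are made up, and it assumes packages are visible as committed `<pkg>.opam` files at the repo root or under an `opam/` directory, which will not hold for every repo.

```python
import subprocess
from pathlib import PurePosixPath


def git(repo_dir, *args):
    """Run a git command inside repo_dir and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", repo_dir, *args],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def opam_packages(paths):
    """Return the package names for the opam files among the given paths.

    Assumption: a package is a `<pkg>.opam` file at the repo root or
    directly inside an `opam/` directory. Other layouts are ignored.
    """
    names = set()
    for p in map(PurePosixPath, paths):
        if p.suffix == ".opam" and str(p.parent) in (".", "opam"):
            names.add(p.stem)
    return sorted(names)


def discover(repo_dir):
    """Map every tag in an already-cloned repo to the packages it defines."""
    index = {}
    for tag in git(repo_dir, "tag", "--list").splitlines():
        # One `git ls-tree` call per tag: list the files recorded at that tag.
        files = git(repo_dir, "ls-tree", "-r", "--name-only", tag).splitlines()
        index[tag] = opam_packages(files)
    return index
```

Calling `discover` on a clone returns a dictionary from tag name to the list of package names found at that tag. Reading the contents of each opam file (for dependencies) would need a further `git show <tag>:<path>` per file, which this sketch leaves out.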

To enable the discovery above, a user would need to write the following stanza in their dune-project (or workspace) file:

(git_repo
 (packages *) ;; one could filter the packages with our predicate lang to exclude particular packages
 (url git+https://github.com/ocsigen/lwt))

Then, dependencies such as (lwt (= 5.0.0)) would resolve to the matching tags in the repo named by the git_repo stanza.
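To make the resolution step concrete, here is a small Python sketch of how an exact-version constraint might be matched against tags. Everything in it is an assumption for the sake of the example: the tag-to-packages dictionary is hypothetical input, the function names are invented, and the rule that a tag is a version string with an optional leading `v` is a guess about naming conventions, not something the proposal specifies. It also handles only `=` constraints.

```python
def tag_version(tag):
    """Turn a tag name into a version string.

    Assumption: tags look like `5.0.0` or `v5.0.0`; a leading `v` followed
    by a digit is dropped, anything else is used unchanged.
    """
    if tag.startswith("v") and tag[1:2].isdigit():
        return tag[1:]
    return tag


def resolve(index, package, version):
    """Return the tag providing `package` at exactly `version`, or None.

    `index` maps tag names to the list of package names found at that tag.
    """
    for tag, packages in index.items():
        if package in packages and tag_version(tag) == version:
            return tag
    return None


# Hypothetical index for illustration; not real data from any repo.
index = {"5.0.0": ["lwt", "lwt_ppx"], "v5.1.0": ["lwt"]}
print(resolve(index, "lwt", "5.0.0"))
```

A real solver would need to handle ranges and compare versions using opam's ordering rather than string equality.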

Issues

Iterating over the metadata for every single tag in a repo could be slow, given that we'll need to run at least two git commands per tag. It should be much more viable with libgit, however.

cc @Lupus who also wanted a workflow that avoids an opam repository.

rgrinberg avatar Feb 02 '24 08:02 rgrinberg

Just to add here that the Go folks avoid the performance penalty by having a central proxy that caches all this info and having their client hit the proxy by default (unless the user turns that off, e.g. for private repos). Anyway, we see that some centralized info store becomes important for performance, just like cloud build caching.

yawaramin avatar Feb 02 '24 13:02 yawaramin

I love it! We currently publish packages to an internal opam repository, which is just a git repo that points packages to, well, other git repos for the actual sources at specific tags. That sounds like a lot of complication for no benefit... And I can't say that it works fast: opam update runs for ages while it keeps cloning all that stuff.

We are indeed a small team (like 5-10 devs), and we have maybe several dozen libs/services. Everything is on git anyway; we just have an extra step in our CI that makes an automated commit to another repo to publish, and we have to rotate the GitLab secret in our CI according to security team policies :D

Lupus avatar Feb 02 '24 15:02 Lupus