nix
nix copied to clipboard
libfetchers/git: Allow Git Remote Helpers
Motivation
Use Nix with repositories that are (only) accessible via remote helpers, which are programs that are invoked by Git "under the hood" based on URL schemes.
This opens up lots of possibilities to interoperate with other storage/transport mechanisms.
Example remote helpers (by URL scheme, alphabetically):
codecommit: aws/git-remote-codecommitgitopa: see docs.gitopia.comgrave: rovaughn/git-remote-gravehg: glandium/git-cinnabaripfs: cryptix/git-remote-ipfs, dhappy/git-remote-ipfsipld: ipfs-shipyard/git-remote-ipldmango: axic/git-remote-mangorad: see radicle-dev/heartwood and radicle.xyz/guides/userrclone: datalad/git-remote-rclonessb: clehner/git-ssb
Context
The following two changes (both part of this PR) make use of such remote helpers possible in conjunction with Nix:
-
Relax URL scheme filtering for the Git fetcher, such that unknown URL schemes are not rejected directly. In case there is no corresponding remote helper available, the following output is produced:
warning: URL scheme 'git+invalid' is non-standard and requires 'git-remote-invalid'. git: 'remote-invalid' is not a git command. See 'git --help'. error: program 'git' failed with exit code 128 -
Add
GIT_DIRto the environment, since this is required by remote helpers.
As submitted (ee66ec343) this is a proof of concept, and works for me locally. I do understand that this feature might have to be protected by an experimental flag and docs would have to be added/changed. I am happy to do this if there's positive feedback.
Priorities and Process
Add :+1: to pull requests you find important.
The Nix maintainer team uses a GitHub project board to schedule and track reviews.
I want to raise the following question / thought: This is a quite general approach. Can it be expected that all remote helpers can be called without any additional command-line options? And if not, would they better be supplied when calling nix, i.e.,
nix ... --remote-helper-opts '--a-opt a-val' git+helper://...
# or
NIX_REMOTE_HELPER_OPTS=... nix git+helper://...
or via the URL to properly pair the options with the remote helper call (for nested fetches of flake inputs etc.)?
nix ... 'git+helper://...?opts=--a-opt%20a-val'
I want to raise the following question / thought: This is a quite general approach. Can it be expected that all remote helpers can be called without any additional command-line options?
Yes. Remote Helpers are invoked by Git and the Git reference documentation on Remote Helpers states:
Remote helper programs are invoked with one or (optionally) two arguments. The first argument specifies a remote repository as in Git; it is either the name of a configured remote or a URL. The second argument specifies a URL; it is usually of the form
<transport>://<address>, but any arbitrary string is possible.
Regarding
[...] or [would they better be supplied] via the URL to properly pair the options with the remote helper call (for nested fetches of flake inputs etc.)?
nix ... 'git+helper://...?opts=--a-opt%20a-val'
This is something that Remote Helpers can leverage without Git passing arguments. They control the URL scheme, so during git clone 'git+example://x?a-opt=a-val', the remote helper git-remote-example is invoked with the argument example://x?a-opt=a-val and can interpret the URL.
However, it will be tricky to get a URL with query string past Nix, since it also uses query strings to encode fetching information. With a file called git-remote-example in $PATH that just writes its $@ to a file, I was able to verify that
$ nix ... 'git+example://test?x=y&ref=main'
results in
example://test
and conversely
$ nix ... 'git+example://test%3Fx=y?ref=main'
results in
example://test%3Fx=y
So, Remote Helpers that do this, and require syntax that is otherwise parsed by Nix, will be incompatible. One workaround is the one you mentioned, somehow prefixing arguments that should not be parsed by Nix, e.g. ?pass-x=y. One loses the ability to just copy the URLs that the protocol of the remote helper requires (and maybe add ?rev or so).
IMO the conflation of fetching arguments and URL are the issue here, but that's Pandora's box. I think that there still are some interesting use cases. I am successfully using Remote Helpers with this patch.
I have no decision-oriented opinion on this, but a few notes:
-
There was movement to migrate away from shelling out to
gitand instead uselibgit2. So we should at least keep on the radar if we can port such a new feature tolibgit2. -
I like that using Git through Nix becomes more transparent and natural that way.
-
This introduces reproducibility risks, since now fetching would essentially require knowing the entire Nix setup including stuff from the environment. This change would lead to https://github.com/NixOS/nix/issues/3533 squared. And even if accepted, it would further move apart the user experience of vanilla Nix from Nix on NixOS, since you'd have to nail down more moving parts. I think we'd rather want expressions that will always work on a sufficiently modern Nix out of the box.
-
Lately, any work on fetchers for me raises the question if they should be part of Nix at all (apart from staying backward compatible). Tvix says no. Things like
npins(which usesnix-prefetch-urlandbuiltins.fetchurlunderneath) andgridlock(which does not depend on Nix at all) show that we can completely decouple obtaining sources from evaluating expressions. Decoupled fetching could be provided by the distribution of Nix -- one could argue, Nixpkgs is a Nix distribution, NixOS is a Nix distribution, even Flakes are in that sense a Nix distribution. -
Finally, what we of course should also optimise for is making Nix useful. But there are many ways to do that, such as finding and documenting convenient and scalable usage patterns. I've seen people who don't have a single
builtins.fetch*in their code and manage all remote sources throughgit subtree.
@fricklerhandwerk thanks a lot for thinking about this and providing your insights. Very valuable! 🙇🏻
I have no decision-oriented opinion on this, but a few notes:
- There was movement to migrate away from shelling out to
gitand instead uselibgit2. So we should at least keep on the radar if we can port such a new feature tolibgit2.
Right. libgit2 does not support remote helpers. It looks like it did give some trouble:
- Introduction (as you mentioned): https://github.com/NixOS/nix/pull/9240
- Partial revert: https://github.com/NixOS/nix/pull/9806
The revert was done for a somewhat similar reason: libgit2 also does not support credential helpers. And actually, now I realize that this PR goes in the exact opposite direction of
- #9807
This ties in quite directly to your point about moving fetchers out of Nix.
- I like that using Git through Nix becomes more transparent and natural that way.
I fail to understand. Which way exactly do you mean by "that way"?
- Lately, any work on fetchers for me raises the question if they should be part of Nix at all (apart from staying backward compatible). Tvix says no. Things like
npins(which usesnix-prefetch-urlandbuiltins.fetchurlunderneath) andgridlock(which does not depend on Nix at all) show that we can completely decouple obtaining sources from evaluating expressions.
Yeah, I agree they probably shouldn't. I would love to write my own fetcher that can be reused by various Nix implementations. That requires an interface and spec for fetchers. And of course it introduces a bootstrapping issue (as always): How to fetch fetchers? Quoting from the Git docs on Remote Helpers:
Git comes with a "curl" family of remote helpers, that handle various transport protocols, such as
git-remote-http,git-remote-https,git-remote-ftpandgit-remote-ftps. They implement the capabilitiesfetch,option, andpush.
But, not surprisingly, there already are fetchers in builtins that would have to be maintained for backward-compatibility anyway! With a nicer interface for "fetching fetchers", it might be feasible to remove the requirement that builtins.fetchgit supports credential helpers (this has obvious r13y issues anyway...). Instead, people that use credential helpers would plug in a fetcher that supports it. The "powerful" Git fetcher.
- Finally, what we of course should also optimise for is making Nix useful. But there are many ways to do that, such as finding and documenting convenient and scalable usage patterns. I've seen people who don't have a single
builtins.fetch*in their code and manage all remote sources throughgit subtree.
Yup, 💯. My motivation for this PR is that I wanted to have the nice UX of
$ nix run git+helper://project
where "project" is something that you cannot effectively fetch with plain Git, you need git-remote-help.
In Nix source files, or on the "repo structure" level, there's lots of room to plug in fetchers, but CLIs are much more constrained if you want something usable, i.e., that humans will ever be willing to type or copy and paste.
I like that using Git through Nix becomes more transparent and natural that way. I fail to understand. Which way exactly do you mean by "that way"?
The way proposed in this change.
My motivation for this PR is that I wanted to have the nice UX of
nix run git+helper://projectwhere "project" is something that you cannot effectively fetch with plain Git, you need git-remote-help.
Yes, this is a totally legitimate use case to me. It also hints at that the new CLI is architecturally a different beast from everything underlying it: you can't make the CLI more convenient in the straightforward way proposed in this PR without breaking what I deem essential properties of the lower layers. For me this reinforces the idea that fetching should be part of distributions (or at least par of the porcelain) rather than the plumbing.
@fricklerhandwerk It is ok for lower layers to bend to the requirements of the higher layers. What is the purpose of lower layers if not serving the highest layer, users? Of course we want to be careful not to make a mess, so I'm interested to know: what are the essential properties of the lower layers that you believe are violated?
In this discussion the lower layer is the Nix language, which contains built-in fetchers. IMO an essential property of Nix expressions is derivation-/store-object-level reproducibility: given an expression and some files, you should always get the same derivations/store objects, everywhere any time. (Yes, in practice there are heaps of caveats, but that's my personal ideal.) Nix expressions are a central user interface to Nix, and major consumers are Nixpkgs and NixOS.
This particular attempt at making the other user interface, the new CLI, more convenient, would degrade reproducibility of the language layer, because it adds space for a lot more moving parts. Therefore I claim these two UIs are architecturally separate, and would suggest to treat them as such. Your suggestion to add a conditional depending on who is calling, would only keep those concerns entangled for the benefit of a smaller diff, which I think would make for a long-term liability.
Going further, I also claim that fetching file system trees and computing with file system trees are also separate concerns with very different constraints, but right now they are entangled in the big "Nix language" component. For indication that this may indeed be true check the endless stream of issues and pull requests trying to make fetching behave more like curl and git clone, and how they compete with the aspiration to keep Nix expression evaluation hermetic.
An ideal architecture would produce data flow graphs like this, where the CLI offers means to control any part of the pipeline, and things like git helpers nicely find its place:
graph TD;
fetching[remote source] --> fso[file system object] --> expression --> derivation
CLI --> fetching & fso & expression & derivation
git-helpers -.-> fetching
As opposed to where we currently are, where expressions encode where to get stuff from, and where plugging git helpers will add degrees of freedom that make it hard to isolate expressions (I can't even draw the CLI interactions without making it a ball of yarn):
graph TD;
fetching[remote source] --> fso[file system object] --> expression --> derivation
git-helpers -.-> fetching & expression
expression --> fetching
@fricklerhandwerk Thank you for your elaboration.
I believe there's a lot of merit in Nix's ability to manage the fetching of expression files. For this purpose, we have to pick between
- built-in fetchers
- fairly benevolent import from derivation on fixed output derivations (deps will have been cached, almost always)
- submodules and/or user scripts, neither of which scale
While I wouldn't have minded to rely on IFD/FOD fetching until the built-in fetchers were actually good, that's not the path that was chosen.
For indication that this may indeed be true check the endless stream of issues and pull requests trying to make fetching behave more like
curlandgit clone,
This is a problem with the process that led to Nix's reproducibility being committed to the behavior of a program, git, that wasn't properly studied, presumably because people felt familiar with it, or there wasn't sufficient review, or they were pressured into delivering. I don't know the full history, but this has caused undue reputational damage to the feature, which is very unfortunate, because it is a very good feature when executed well, specifically when it comes to performance and usability (part standardization, no reliance on derivation system which may be unknowable, and probably more; haven't memorized it all).
We still have an opportunity to fix it. It is behind a feature flag, which we can lift when it behaves well. (Which is all we can do with it, fwiw)
It seems that you're not interested in developing fetchers. Perhaps you'd prefer to delegate this?
(Also I'm not sure if it's fair to "hijack" a PR to have a radical architectural conversation. How do we feel about this?)
Hey, has there been any further discussion about remote helpers / custom fetchers somewhere (since it was briefly mentioned during the conversation here)? I'm currently stuck in a situation where my use case requires that I have access to a feature like this, so I would like to offer my help in any shape or form which might be helpful to get a feature like this out the door. :)
This feature would be valuable for supporting modern Git workflows and alternative hosting solutions like Radicle. While I understand the desire to get the architecture right first, I wonder if we could take an incremental approach by adding remote helper support behind an experimental flag (similar to other Nix features). This would allow the community to experiment with and validate real-world use cases, while the longer-term architectural work continues.
My specific use case involves using private Radicle repositories as flake inputs in my NixOS configurations. Currently this does not work, but with remote helper support it could work seamlessly with the standard Nix tooling. This would make it easier to use Nix with newer decentralized source control solutions.