rfcs
rfcs copied to clipboard
[RFC 0171] Default name of fetchFromGithub FOD to include revision
Make fetchFromGithub FOD name more meaningful. This avoids stale artifacts and gives more content fidelity when looking at nix store paths.
Rendered: https://github.com/jonringer/rfcs/blob/jringer/fetch-from-github/rfcs/0171-fetch-from-github.md
Items for further refinement:
- Other version control fetchers
- Prefix vs suffix of
srclabel
Just one silly question: why only fetchFromGitHub but not including fetchFromGitLab and others, besides the reason that it is very common?
Although fetchFromGithub is one of many fetchers; it is very common, and generally has a user specify granular source information which makes differentiating between sources easy.
Just one silly question: why only
fetchFromGitHubbut not includingfetchFromGitLaband others, besides the reason that it is very common?
I did reference this https://github.com/NixOS/rfcs/blob/1ac472d4f0b82ec9f6781cd8c5a4f88fabef631e/rfcs/0171-fetch-from-github.md?plain=1#L104
I mention fetchFromGitHub because it's about 20x as common:
$ rg -i fetchFromGitHub | wc -l
33178
$ rg -i fetchFromGitlab | wc -l
1478
I'm not opposed to doing a similar treatment to the other fetchFromX helpers. I think this RFC could be expanded to encompass them all eventually.
To your point, maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"
There's already a speculative impl as of 2 weeks ago: https://github.com/NixOS/nixpkgs/pull/294068
Though the format is slightly different ("source-${owner}-${repo}-${rev}", essentially)
Though the format is slightly different (
"source-${owner}-${repo}-${rev}", essentially)
I choose the format to make it easier to inspect which source FOD is going wrong when several packages got rebuild at the same time.
Update: Just came across NixOS/nixpkgs#49862, which seems to work on the package names of fetchers.
maybe this should be relabeled as "Default name of fetchFromX FOD helpers to include revision"
IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).
Hehe per https://github.com/NixOS/rfcs/pull/133#issuecomment-2016614005, my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!
Hehe per https://github.com/NixOS/rfcs/pull/133#issuecomment-2016614005, my dream of single hash git fetchers (avoiding the need to put things in the name like this) is now a good bit closer!
Congratulations!
We still need to fix those fetchers at Nixpkgs level, though, since Nixpkgs often takes years to adopt new Nix features.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfcsc-meeting-2024-04-02/42643/1
I'd like to nominate myself as a shepherd. I previously authored a related implementation, NixOS/nixpkgs#294068, which could inform the discussion around this RFC's design.
I'm excited about this RFC and look forward to working with the community to shepherd it through the process.
IMO, "fetchers for version-controlled repositories" would be a suitable target. This includes fetchers based on specific version control systems (e.g. fetchcvs, fetchsvn, fetchgit, fetchhg, fetchbzr, etc.) and fetchers based on specific service providers (e.g. fetchFrom*).
If we have a shepards meeting, we can refine the scope.
Yes, https://github.com/NixOS/nixpkgs/pull/49862 is very related, in its latest reincarnation it allows you to keep both *-source names on Hydra and generate pretty *-<name>-<version>-<optional fetcher>-source names with config.nameSourcesPrettily option set. See https://github.com/NixOS/nixpkgs/pull/49862/files#diff-1977c7748af8b43b92093d2383e23e88c4df10a702578fbe885675a0956a2f1f
To code the fetcher name into the name attribute, the default name could also be ${repo}-${rev}-github-source.
People who use the result FOD produced by fetchFromGitHub don't even need to know about what name is. They'll just use src.name to set sourceRoot and use sourceRoot in their build commands.
As an enforcible specification, maybe we don't even need to specify what name should default to in the RFC level, but require the default to contain either version or source revision.
(Whether/how long the revision could be truncated are still in this scope, since it affects the probabity of colisions.)
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfc-171-default-name-of-fetchfromx-to-include-version-information/43001/1
This RFC has not acquired enough shepherds. This typically shows lack of interest from the community. In order to progress a full shepherd team is required. Consider trying to raise interest by posting in Discourse, talking in Matrix or reaching out to people that you know.
If not enough shepherds can be found in the next month we will close this RFC until we can find enough interested participants. The PR can be reopened at any time if more shepherd nominations are made.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfcsc-meeting-2024-04-16/43512/1
I'll toss my name in for an RFC shepherd, as I believe in the utility of this proposal.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfc-171-default-name-of-fetchfromx-to-include-version-information/43001/2
@oxij I would like to nominate you as a shepherd, looks like you've thought about this longer than I've been involved in nixpkgs.
I like your addition of a lib function to make derivation names more consistent in https://github.com/NixOS/nixpkgs/pull/49862
Thanks for a nomination, but I don't think this RFC is a proper solution to the stated problem.
Eelco and others objecting to the <hash>-<name>-<version or revision>-source naming scheme (and to the original 2018 version of #49862 which wanted to do exactly this, but for all fetchers, not just fetchFromGitHub, and, primarily, for a different reason of /nix/store discoverability) do have a point: the current most common <hash>-source naming scheme is awfully convenient for running a build server (like Hydra) and when you have to frequently switch between fetchers (e.g. when using out-of-Nixpkgs derivations and/or flakes).
IMHO, the problem this RFC tries to address is actually a set of features/bugs/issues of Nix itself:
-
Nix reuses existing fixed-output outputs between different derivations without actually checking the derivation actually builds into an output with the given output hash; which is certainly a feature, given how useful it is for switching between different fetchers;
-
but, IIRC, last time I tried, explicit
nix-store -r --check <derivation>over such a derivation will fail to re-build and re-check it; which is either a bug, or there should be a separate option for doing this; for simplicity, let's assume we want a new option for this, and we want to call it--force-rebuild; -
nix-buildandnix buildlack--force-rebuildoption (similarly to how they lack--checkoption too), you have to callnix-store -rmanually (an issue).
(Now that I think about it, I should probably open Nix issues for these. Later, I'm a bit busy ATM.)
If the latter two issues were not such, then you could just do nix-build --force-rebuild -A hello.src instead of any of this, and OfBorg could also run that on all changed fixed-output derivations in a PR, ensuring the author did not screw it up regardless of any (non-)changes to derivation names.
Which would be a better solution to the stated problem, since this RFC will fail to solve it for derivations with custom names anyway.
In the meantime, IMHO, since changes like #245388, #247977, #248528, #248683 are generally useful and get merged without objections, I think you should apply #49862 to your Nixpkgs, split yet-unfixed changes to sourceRoots from your #153386 into a separate patchset (I think I covered most of your changes with my PRs, but I probably missed at least some of those changes), rebuild your system with #49862 applied and nameSourcesPrettily = true or nameSourcesPrettily = full set, and PR any changes you need to make it all work.
Posting a nice review in #49862 so that it would get a chance to actually get merged eventually would be nice too, I suppose.
Then, if nobody ever wants to work on the Nix issues described above, and #49862 gets merged, we could discuss setting nameSourcesPrettily = "full" for OfBorg as a workaround.
And, IMHO, it should be "full", not true (which is essentially what this RFC proposes) so that changing between fetchgit, fetchFromGitHub, fetchFromGitLab, etc would force a recheck too.
I certainly think this conversation needs to be revived because of the security implications of
Nix reuses existing fixed-output outputs between different derivations without actually checking the derivation actually builds into an output with the given output hash
which leaves us open to a potential cache poisoning attack on cache.nixos.org, related to https://github.com/NixOS/ofborg/issues/68#issuecomment-2085327052 - using the same name for a large proportion of our FODs significantly exacerbates this.
@risicle You are correct, it does have security implications, but even doing the equivalent of nameSourcesPrettily = "full" of https://github.com/NixOS/nixpkgs/pull/49862 wouldn't really help in the most general case: @tobiasBora in https://github.com/NixOS/ofborg/issues/68#issuecomment-2083351493 shows how to easily poison OfBorg's cache.
Obviously, it won't work for poisoning Hydra, since the evil PR of his method won't ever get merged.
But now thinking about this, I have an IMHO much simpler and thus scarier attack that will work against Hydra:
- Malice makes an evil PR with a package update that changes the source's revision to a new version but points the hash to a hash of an older version of the same package, a version which has a well-known CVE.
- A very paranoid Nixpkgs maintainer Alice (who does not use any build caches) looks at it, it looks fine, so she builds it, and it works fine, since Alice has the older version of the source in
/nix/store, and so she merges it. - All tools that track unfixed CVEs in Nixpkgs only look at package versions, so those will be fooled too.
- Malice can now exploit the CVE.
In fact, if that source derivation also has a custom name (which is not uncommon), then nothing will catch this (including me, with all my machines not using the Hydra cache and all using the equivalent of nameSourcesPrettily = "full" since 2018).
A scary thought.
So, I guess https://github.com/NixOS/nix/issues/969 is actually a critical bug, then.
You're close to the scenario detailed by that author to the security team.
Ultimately we can never completely stop a malicious change missing the attention of a reviewer (and this goes for all projects, hashed caching scheme or not), but we can make it so the submitter has to jump through some more hoops that will hopefully make the PR look weirder and provoke more attention. Hence small ("small") changes like this.
Eelco and others objecting to the
<hash>-<name>-<version or revision>-sourcenaming scheme (and to the original 2018 version of NixOS/nixpkgs#49862 which wanted to do exactly this, but for all fetchers, not justfetchFromGitHub, and, primarily, for a different reason of/nix/storediscoverability) do have a point: the current most common<hash>-sourcenaming scheme is awfully convenient for running a build server (like Hydra) and when you have to frequently switch between fetchers (e.g. when using out-of-Nixpkgs derivations and/or flakes).
The objection(https://github.com/NixOS/nixpkgs/pull/49862#issuecomment-436585365 and https://github.com/NixOS/nixpkgs/pull/59858#issuecomment-484896229) is no longer relevant, as sourceRoot = "source" is officially deprecated since Nixpkgs 23.11 when an FOD returned by a fetchzip-based build helper is assigned to src. At least five rounds of clean-up PRs has been merged to eliminate the sourceRoot = "source" assumption.
I still think these are two different points:
-
Changing most source derivation names no longer breaks most builds, yes. (Some do depend on customly named source derivations still, AFAIK. E.g. Apache Arrow, Datafusion, etc.)
(Also, as the author of both ccbb065c88080966e14872ea9e0ac577f753d902 and 775f21b9fdc1f2bb46518a575339fac8ff867959 in Nixpkgs I feel it kind of a conflict of interest to push for this for those reasons.)
-
But moving away from
*-sourcenaming scheme will still make using flakes inconvenient.
- Changing most source derivation names no longer breaks most builds, yes. (Some do depend on customly named source derivations still, AFAIK. E.g. Apache Arrow, Datafusion, etc.)
Custom src-name and sourceRoot are often required for tools that requires certain source directory name to work.
- But moving away from
*-sourcenaming scheme will still make using flakes inconvenient.
Good point!
That would be an unfortunate cache miss. Nevertheless, isn't it "the Nixy way" to pursue dependency isolation even at the cost of rebuilds (i.e. changing the version of compiler used to compile Bash triggers a world rebuild)?
I wrote a long post with pros/cons of different possible solutions in https://github.com/NixOS/nix/issues/969#issuecomment-2096916208
So, to
That would be an unfortunate cache miss. Nevertheless, isn't it "the Nixy way" to pursue dependency isolation even at the cost of rebuilds (i.e. changing the version of compiler used to compile Bash triggers a world rebuild)?
my answer is that, personally, I would like my build machines to always check all new fixed-output things coming from Nixpkgs, but never check those made by me (unless I ask for it explicitly).
The linked comment describes one way to implement that.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfcsc-meeting-2024-05-14/45414/1
RFCSC:
@oxij We notice that you didn’t accept the nomination, but keep in mind that you don’t need to agree with the RFC to be a shepherd, contrasting views can be very helpful. The role of the shepherd team is to decide whether the community as a whole would benefit from the RFC, having different views in the shepherd team can be helpful to ensure the entire community is heard. Please reconsider if you’d like to be a shepherd after all and let us know.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/rfcsc-meeting-2024-05-28/46113/1