fetchTree attempts download despite narHash existing in store
Describe the bug
The output of fetchTree { type = "file"; narHash = "..."; url = "..."; } seems to depend only on the narHash, since fetching the same file from different URLs gives the same outPath.
However, unlike a fixed-output derivation, fetchTree will try to perform the download (or at least attempt to connect to the URL) even if the outPath already exists. I think this is because it checks a URL-based cache, but does not check whether the store path already exists.
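For comparison, here is a minimal sketch of the fixed-output behaviour being referred to, using the <nix/fetchurl.nix> helper that ships with Nix. The name "hello-world" is an arbitrary choice, and the hash is the flat SHA-256 of the 11-byte file "hello world" used in the reproduction in this report. The resulting store path will not be the same as the one fetchTree produces, since the name and hashing mode differ; the point is only how the output path relates to the URL.

```nix
# Sketch only: a fixed-output derivation for the same file.
# The output path is derived from `name` and the hash, not from `url`.
import <nix/fetchurl.nix> {
  name = "hello-world"; # arbitrary name chosen for this example
  url = "https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e";
  # Flat SHA-256 of the bytes "hello world", in base16.
  sha256 = "b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9";
}
```

Once this has been built, changing only the url leaves the output path unchanged, and Nix has no need to contact the network as long as that path remains valid in the store.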
Steps To Reproduce
Use fetchTree with a narHash to fetch a file from a URL, and note its outPath. Then try it with the same narHash and a different URL. It will attempt to connect/download, even though we already have that outPath.
Here's a concrete example, fetching the same file from multiple IPFS gateways (I got this IPFS CID using printf 'hello world' | ipfs block add):
Fetch from ipfs.io with an empty narHash, to find what the narHash should be (I'm on NixOS, but using Nix 2.27 for some unrelated git-hashing fixes):
$ nix repl
Nix 2.27.0pre19700101_dirty
Type :? for help.
nix-repl> builtins.fetchTree { type = "file"; url = "https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = ""; }
error:
… while calling the 'fetchTree' builtin
at «string»:1:1:
1| builtins.fetchTree { type = "file"; url = "https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = ""; }
| ^
… while fetching the input 'https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e?narHash=sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA%3D'
error: NAR hash mismatch in input 'https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e?narHash=sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA%3D', expected 'sha256-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=' but got 'sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM='
Fetching with that narHash (the file seems to have been cached):
nix-repl> builtins.fetchTree { type = "file"; url = "https://ipfs.io/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM="; }
{
narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM=";
outPath = "/nix/store/0csgnsbvjfr2axpryskr9v7l43bzjvnd-source";
}
Now that we know the narHash, try fetching the same file from a different URL:
nix-repl> builtins.fetchTree { type = "file"; url = "https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM="; }
warning: error: unable to download 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e': Could not resolve hostname (6) Could not resolve host: cloudflare-ipfs.com; retrying in 303 ms
warning: error: unable to download 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e': Could not resolve hostname (6) Could not resolve host: cloudflare-ipfs.com; retrying in 645 ms
warning: error: unable to download 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e': Could not resolve hostname (6) Could not resolve host: cloudflare-ipfs.com; retrying in 1049 ms
warning: error: unable to download 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e': Could not resolve hostname (6) Could not resolve host: cloudflare-ipfs.com; retrying in 2426 ms
error:
… while calling the 'fetchTree' builtin
at «string»:1:1:
1| builtins.fetchTree { type = "file"; url = "https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM="; }
| ^
… while fetching the input 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e?narHash=sha256-rkUEKu9bFIg12wLQRf6JtMCf%2BeR22rABoUvAMi0/IJM%3D'
error: unable to download 'https://cloudflare-ipfs.com/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e': Could not resolve hostname (6) Could not resolve host: cloudflare-ipfs.com
[0.0 MiB DL]
Cloudflare have shut down that IPFS gateway, but we still attempted to connect to it despite already having that file in our store.
If we try another, working URL then it will re-download the file, but the output is identical to using the original ipfs.io URL:
nix-repl> builtins.fetchTree { type = "file"; url = "https://gateway.pinata.cloud/ipfs/bafkreifzjut3te2nhyekklss27nh3k72ysco7y32koao5eei66wof36n5e"; narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM="; }
{
narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM=";
outPath = "/nix/store/0csgnsbvjfr2axpryskr9v7l43bzjvnd-source";
}
Expected behavior
If the outPath already exists in our store, then those fetchTree calls should return the { narHash = "..."; outPath = "..."; } result immediately, without attempting to download the URL.
Metadata
nix-env (Nix) 2.27.0pre19700101_dirty
Additional context
Related issues:
- https://github.com/NixOS/nix/issues/9570 seems to also be caused by fetchTree running "eagerly", in a way that fixed-output derivations wouldn't.
- https://github.com/NixOS/nix/issues/9077 would make fetchTree act more like a fixed-output derivation. It doesn't mention the outPath being independent of the input URL, or fetchTree being too "eager" to perform a download when it doesn't need to.
I'm currently working around this in a rather clunky way, by using a fixed-output derivation that uses /bin/sh to make a copy of the fetched file. This way, I can query whether the outPath already exists without having to call fetchTree (I use /dev/null instead):
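# narHash and cid are assumed to be bound in the enclosing scope.
with rec {
  inherit (builtins) currentSystem derivation fetchTree getEnv pathExists;

  # Allow the gateway to be overridden via the IPFS_GATEWAY environment variable.
  override = getEnv "IPFS_GATEWAY";
  gateway = if override == "" then "https://ipfs.io" else override;

  # Fixed-output derivation that copies src to $out. Its outPath depends only
  # on the name and narHash, not on src.
  fixed = src: derivation {
    name = "source";
    builder = "/bin/sh";
    system = currentSystem;
    outputHashMode = "nar";
    outputHash = narHash;
    args = [
      "-c"
      ''read -r -d "" content < ${src}; printf '%s\n' "$content" > "$out"''
    ];
  };

  # Compute the outPath without calling fetchTree, by passing a dummy src.
  existing = (fixed "/dev/null").outPath;

  # Only call fetchTree when the outPath is not already in the store.
  file = if pathExists existing then existing else fixed (fetchTree {
    inherit narHash;
    type = "file";
    url = "${gateway}/ipfs/${cid}";
  });
};
file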
Checklist
- [x] checked latest Nix manual (source)
- [x] checked open bug issues and pull requests for possible duplicates
I'm seeing the same thing, with an input of type github. I ran a simple repro like this:
docker run -it --rm nixos/nix
Then this command:
nix eval \
--extra-experimental-features nix-command \
--extra-experimental-features flakes \
--expr 'builtins.fetchTree "github:NixOS/nixpkgs/fdfc4347e915779fe00aca31012e23941b6cd610?narHash=sha256-pCglMme56MWxtTNRWrLj55/eJXw4dX4HmZYXUm6%2BDO4%3D"'
It downloads a whole Nixpkgs checkout. I verified that the reported outPath ended up in the store. Then I ran rm -r ~/.cache and ran the command again, and it downloaded everything again.
FWIW this was already reported and marked as fixed once before: https://github.com/NixOS/nix/issues/10104.
I poked around in the code and found that the code path which reuses the path in the store requires that isFinal() return true:
https://github.com/NixOS/nix/blob/9ed5482545b609b095d3597ef31eaa64c9ad5ed8/src/libfetchers/fetchers.cc#L313-L320
and isFinal checks some secret attr called __final:
https://github.com/NixOS/nix/blob/9ed5482545b609b095d3597ef31eaa64c9ad5ed8/src/libfetchers/fetchers.cc#L158-L161
My first thought was to try passing __final = true as part of my attrs to builtins.fetchTree, but it seems that's not allowed:
https://github.com/NixOS/nix/blob/9ed5482545b609b095d3597ef31eaa64c9ad5ed8/src/libexpr/primops/fetchTree.cc#L203-L204
See #10612 for the motivation why this is the case (in short: the narHash doesn't guarantee that we have all the other attributes that the fetcher might return, like lastModified and revCount).
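To make the condition described above concrete, here is a toy model of it written in Nix. It is not Nix's real implementation (that lives in the C++ code linked above); the names canReuseStorePath, locked and final, and the example.org URL, are made up for illustration. It only encodes what this comment describes: the store path is reused when the input is marked final and has a narHash.

```nix
# Toy model only; the real check is in src/libfetchers/fetchers.cc.
let
  # The store path is only reused when the input is marked final
  # *and* carries a narHash.
  canReuseStorePath = input: (input.__final or false) && input ? narHash;

  # What a user can pass to builtins.fetchTree today (no __final allowed).
  locked = {
    type = "file";
    url = "https://example.org/file"; # placeholder URL
    narHash = "sha256-rkUEKu9bFIg12wLQRf6JtMCf+eR22rABoUvAMi0/IJM=";
  };

  # The same input as Nix would mark it internally once it is final.
  final = locked // { __final = true; };
in
{
  userSupplied = canReuseStorePath locked; # false
  markedFinal = canReuseStorePath final; # true
}
```

Under this model, a user-supplied narHash on its own never reaches the reuse path, which matches the re-downloads reported in this thread.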
A narHash alone is not quite enough to avoid a download. The fetcher needs further attributes, which are normally supplied by the fetcher cache under ~/.cache, so a re-fetch is forced when they are not there. In other words, that flakeref might be locked in some sense, but because of the missing attributes it can't be used to create a proper lockfile without re-fetching or checking the fetcher cache.
Needs some design work.
Needs some clarification on whether adding those other attributes would fix the problem (lastModified? rev/ref for git inputs, etc.?) and whether documenting that would be enough.
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/2025-05-04-nix-team-meeting-minutes-230/65206/1
Hey, thanks for getting back to me.
I'll just add one user's perspective: it seems like fetchTree is doing double-duty here: on one hand it's part of the machinery underlying flakes, and on the other hand it's being positioned as a nice unified replacement for all the old builtin fetchers (fetchGit, fetchTarball, etc.). These attributes like lastModified or revCount seem like flake concerns, not fetcher concerns -- so I would hope not to see flake concerns start to affect the fetchers.
Some of the attributes like ref/rev are there to allow reliable fetching. The local short-circuit with just narHash is helpful, but doesn't help fetch the correct thing when others are using that URL.
Oh yes, I totally expect to use ref/rev like the traditional fetchers do. My example above showed using a URL like builtins.fetchTree "github:NixOS/nixpkgs/<rev>?narHash=<hash>". It's when it comes to the more extraneous-sounding attributes that I become concerned. For one thing, they'd make this URL more cumbersome.
Is there a sensible workaround I can use right now -- is it the case that applying more URL parameters can make it final? The unwanted fetching is a problem.
> I would hope not to see flake concerns start to affect the fetchers.
+1, I avoid all the flake stuff, but it's always nice to improve the builtin fetchers (compared to the bad old days of (import <nixpkgs> {}).fetchFromGitHub { owner = "nixos"; repo = "nixpkgs"; ... }!)
> Some of the attributes like ref/rev are to allow reliable fetching. The local short-circuit with just narHash is helpful, but doesn't help fetch the correct thing when others are using that url.
Could such attributes just be "passed through" from the arguments into the result? If we evaluate builtins.fetchTree { ...; narHash = "..."; ref = "foo"; rev = "bar"; } and that narHash is already in the store, then we would get { outPath = "..."; narHash = "..."; ref = "foo"; rev = "bar"; }, with those ref and rev values simply copied from the arguments. Alternatively, could __final be allowed as an argument, to say we don't care about those things?
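As a rough illustration of the "pass through" idea, here is a hypothetical model in plain Nix. Nothing in it is part of Nix's API: passThrough and fetcherOnly are illustrative names, the git type and example.org URL are placeholders, and the "..." values stand in for a real hash and store path, as in the paragraph above.

```nix
# Hypothetical model of the proposal; not how fetchTree behaves today.
let
  # Attributes that only tell the fetcher where to look, so they are
  # dropped from the result.
  fetcherOnly = [ "type" "url" ];

  # Given the fetchTree arguments and an outPath that is already in the
  # store, copy the remaining arguments (narHash, ref, rev, ...) through.
  passThrough = args: outPath:
    removeAttrs args fetcherOnly // { inherit outPath; };
in
passThrough
  {
    type = "git";
    url = "https://example.org/repo.git"; # placeholder URL
    narHash = "...";
    ref = "foo";
    rev = "bar";
  }
  "/nix/store/...-source"
```

This evaluates to an attribute set containing narHash, outPath, ref and rev, with ref and rev taken verbatim from the arguments. Any attribute the caller did not supply (lastModified, revCount) would simply be absent, which is the trade-off the question is asking about.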
It also sounds like substituting has been (temporarily?) disabled for these fetchers, which isn't good for reliability (I've been burned by depending on HTTP URLs multiple times!)
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/nix-copying-a-store-path-into-the-store/60409/16