stackage2nix
Build stackage2nix in NixOS sandbox
Unable to build nix/stackage2nix on NixOS with nix.useSandbox enabled.
nix.useSandbox: If set, Nix will perform builds in a sandboxed environment that it will set up automatically for each build. This prevents impurities in builds by disallowing access to dependencies outside of the Nix store. This isn't enabled by default for performance. It doesn't affect derivation hashes, so changing this option will not trigger a rebuild of packages.
related #40
Currently, I see no way to sandbox the stackage2nix wrapper. See the issues below.
The stackage2nix wrapper requires the following dependencies to be fetched (see nix/lib.nix):
- [x] fpco/lts-haskell
- [x] fpco/stackage-nightly
- [ ] commercialhaskell/all-cabal-hashes
- [ ] hackage-db at https://hackage.haskell.org/01-index.tar.gz
To satisfy the sandbox requirements, all of these dependencies should be prefetched before the build with the standard nix-prefetch-scripts.
Stackage config files
Only the files are needed, so fetchgit can be used to fetch the fpco/lts-haskell and fpco/stackage-nightly dependencies. The only drawback of this approach is less convenient updates: the revision and hash would have to be updated for both repos, instead of bumping a single cacheVersion parameter.
all-cabal-hashes
To build an exact copy of the stackage package set, stackage2nix looks up project definitions in all-cabal-hashes by a hash defined in the stackage config (a single version of a package may have several revisions). To do so, all-cabal-hashes must be fetched with git metadata. Due to NixOS/nixpkgs #8567 there is no reliable way to do this with fetchgit.
One solution might be to fetch a zip archive of a particular version. AFAIK, GitHub can create such links, but only for archives containing the project files, without metadata.
hackage-db
The issue with hackage-db is that the URL doesn't contain a particular version to put in a fetchurl script. I'm assuming that hackage-db could be recreated from the all-cabal-hashes repo, but I'm not sure how. Another solution would be to fetch a versioned db from some other place.
all-cabal-hashes stuff in callHackage: https://github.com/NixOS/nixpkgs/blob/43a62b66d0175b10fd3cc6f1fabdec9d205c171c/pkgs/development/haskell-modules/make-package-set.nix#L126
Regarding the non-determinism of all-cabal-hashes.
I've found this old comment on the original issue thread. The idea is to unpack the git objects and store them uncompressed: https://github.com/bendlas/nixpkgs/commit/4b9c24a5d33407f88457d7e125ca78cbefa30afa
We should be able to do this unpacking as a postUnpack build step.
Downsides:
- will lead to increased size of git repository
Upsides:
- deterministic fetchgit
- (should be checked) we can access those objects through the libgit interface (no changes are needed for stackage2nix itself)
Do you really care about the git history or is it because the tool wants to query the current reference of the checkout?
For the latter, it could make sense to re-build a fake .git database with only the following files:
.git/HEAD -> ref: refs/heads/master
.git/refs/heads/master -> e843a2271a972b8cb6401e67f25d22c8f6fa68cb
@zimbatm It's the mapping from sha1 to file content that is needed.
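For reference, the sha1 in question is git's blob object ID, which hashes a short header plus the file content rather than the raw bytes alone. A minimal Python sketch of that mapping follows; the function name `git_blob_sha1` is made up for illustration and is not part of stackage2nix.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Return the object ID git assigns to a blob with this content."""
    # git hashes the header "blob <size>\0" followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The empty blob has the same well-known ID in every git repository.
    print(git_blob_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the ID depends only on the content, any store that can answer "which bytes hash to this ID" would do in principle; a git object database just happens to provide that lookup for every historical revision.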
so the tool is not looking at the checked-out content but querying the git database directly instead?
if you go down the fetchgit + unpacked blobs maybe you can make it smaller by using a shallow copy of the database.
given the level of effort involved it could make sense to patch upstream as well
@zimbatm The full history is still needed, as we need all blobs reachable from the required commit.
I've discussed this with @4e6, and I think I'll just make a small tool that creates a canonical representation of a git .pack file. If everything (branches, tags) is properly pruned before that, the result will be a working git checkout that is also reproducible. I'll experiment with this approach here. If it works out, I'll try to do the same in fetchgit itself.
I tried the approach referenced in my previous comment, unpacking the git objects: https://github.com/bendlas/nixpkgs/commit/4b9c24a5d33407f88457d7e125ca78cbefa30afa
This increased the all-cabal-hashes checkout size from 1.6 GB to 16 GB, which is not acceptable.
Maybe we can use the GitHub zip archive? It should allow fast random reads.
Filenames are used only as a fallback; the primary addressing method is by GitSHA1. So a full .git repo is needed.
As I understand it, the bare git repo is only used because it is more compact than doing a repo checkout. However, there is no good way to get an up-to-date one within a nix sandbox. I had to revert 86f11b89 while working on updating nixpkgs-stackage.
Getting the latest .zip is trivial (builtins.fetchurl), way faster (20s vs 1m20s for git clone) and way smaller (189MB vs 366MB). Zip allows random access for decompression, so it should be fast to grab files out of.
@yorickvP To make the latest .zip usable, you need to calculate the GitSHA1 of every file inside it and cache this info somewhere. It's doable, except for hackage revisions (just grep for x-revision): only the latest revision will be available, without any way to fetch older ones. That is the problem that having a .git folder solves.
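The indexing step described above could look roughly like this in Python. It is only a sketch: `index_zip_by_git_sha1` is a hypothetical helper, and it assumes that hashing each archive member as a git blob yields the same GitSHA1 that the stackage config refers to. It can only ever see the one revision of each .cabal file that is present in the archive.

```python
import hashlib
import io
import zipfile
from typing import Dict


def git_blob_sha1(content: bytes) -> str:
    """Object ID git would assign to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def index_zip_by_git_sha1(archive: zipfile.ZipFile) -> Dict[str, str]:
    """Map the blob SHA-1 of every .cabal member to its path in the archive."""
    index: Dict[str, str] = {}
    for info in archive.infolist():
        # Skip directories and anything that is not a cabal file.
        if info.is_dir() or not info.filename.endswith(".cabal"):
            continue
        index[git_blob_sha1(archive.read(info))] = info.filename
    return index


if __name__ == "__main__":
    # Tiny in-memory archive standing in for a downloaded snapshot;
    # the member path is illustrative only.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("snapshot/foo/1.0/foo.cabal", b"name: foo\nversion: 1.0\n")
    with zipfile.ZipFile(buf) as zf:
        for sha, path in index_zip_by_git_sha1(zf).items():
            print(sha, path)
```

The resulting dictionary is the "cache" mentioned above; a lookup for a GitSHA1 that belongs to an older hackage revision simply misses.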
The proper solution is to create a canonical, reproducible representation of a .git repo. That may require writing a custom git .pack file generator.
Does stack even expose the used cabal file revision? The intractability of the problem does not seem worth any of the potential savings of using older cabal files sometimes, assuming cabal files are rarely updated and do not break anything.
@yorickvP Yes, it's exposed - e.g. search for GitSHA1 in https://raw.githubusercontent.com/commercialhaskell/lts-haskell/master/lts-12.16.yaml
If non-breaking updates are OK, why have you enabled sandboxing? =)