cabal2nix
Split `hackage-packages.nix` into multiple files
`hackage-packages.nix` in NixOS is ever growing and I think it is time to think about making hackage2nix generate a file per package and a file `callPackage`-ing those files. The reasons are as follows:
- Nix's laziness would mean that upon evaluation of a single attribute, we have to parse less Nix code. Parsing `hackage-packages.nix` is quite expensive and it has a constant impact of over half a second on evaluating anything Haskell related.
- It could also be beneficial for repository size. As I understand it, git is better at deduplicating lots of small files than one big file, so getting rid of the `hackage-packages.nix` behemoth could be helpful here, too. This is, however, based on my very limited understanding and would need to be confirmed by experimentation or by someone more knowledgeable than me.
I am confused. That idea sounds too good to be true. Also, I don't think any other auto-generated part of nixpkgs does this? There have to be reasons why this has not been done.
I think it's just easier and more obvious to generate a single file. As you can see from the ten files with the most lines in nixpkgs, we really are in a league of our own, so my guess is that this is not a problem anyone else really had to consider.
13389 ./pkgs/servers/nosql/influxdb2/influx-ui-yarndeps.nix
13805 ./pkgs/servers/jellyfin/node-deps.nix
14029 ./pkgs/applications/version-management/gitlab/yarnPkgs.nix
15450 ./pkgs/development/compilers/elm/packages/node-packages.nix
17524 ./pkgs/development/r-modules/cran-packages.nix
24655 ./pkgs/top-level/perl-packages.nix
32847 ./pkgs/top-level/all-packages.nix
36831 ./pkgs/tools/typesetting/tex/texlive/pkgs.nix
125326 ./pkgs/development/node-packages/node-packages.nix
298210 ./pkgs/development/haskell-modules/hackage-packages.nix
One question I had: I guess when we do this, we'd have one file, `pkgs/development/haskell-modules/hackage-packages.nix`, that looks like:

    { callPackage }:
    {
      ...
      aeson = callPackage ./haskellPackages/aeson.nix {};
      ...
      conduit = callPackage ./haskellPackages/conduit.nix {};
      ...
      lens = callPackage ./haskellPackages/lens.nix {};
      ...
    }

Then we'd have all our individual packages in `.nix` files in a directory like `pkgs/development/haskell-modules/haskellPackages/`.

So for example `pkgs/development/haskell-modules/haskellPackages/aeson.nix` would look like:

    # `lib` is taken as an explicit argument because the file is standalone
    # and `license` below refers to it.
    { mkDerivation, attoparsec, base, base-compat
    , base-compat-batteries, base-orphans, base16-bytestring
    , bytestring, containers, data-fix, deepseq, Diff, directory, dlist
    , filepath, generic-deriving, ghc-prim, hashable, hashable-time
    , integer-logarithms, lib, primitive, QuickCheck, quickcheck-instances
    , scientific, strict, tagged, tasty, tasty-golden, tasty-hunit
    , tasty-quickcheck, template-haskell, text, th-abstraction, these
    , time, time-compat, unordered-containers, uuid-types, vector
    }:
    mkDerivation {
      pname = "aeson";
      version = "1.5.6.0";
      sha256 = "1s5z4bgb5150h6a4cjf5vh8dmyrn6ilh29gh05999v6jwd5w6q83";
      revision = "2";
      editedCabalFile = "1zxkarvmbgc2cpcc9sx1rlqm7nfh473052898ypiwk8azawp1hbj";
      libraryHaskellDepends = [
        attoparsec base base-compat-batteries bytestring containers
        data-fix deepseq dlist ghc-prim hashable primitive scientific
        strict tagged template-haskell text th-abstraction these time
        time-compat unordered-containers uuid-types vector
      ];
      testHaskellDepends = [
        attoparsec base base-compat base-orphans base16-bytestring
        bytestring containers data-fix Diff directory dlist filepath
        generic-deriving ghc-prim hashable hashable-time integer-logarithms
        QuickCheck quickcheck-instances scientific strict tagged tasty
        tasty-golden tasty-hunit tasty-quickcheck template-haskell text
        these time time-compat unordered-containers uuid-types vector
      ];
      description = "Fast JSON parsing and encoding";
      license = lib.licenses.bsd3;
    }

If this is the approach we took, then `pkgs/development/haskell-modules/haskellPackages/` would have about 16,000 files in it (since there are currently about 16,000 packages on Hackage?).
Would having 16000 files in a directory cause any problems?
> Would having 16000 files in a directory cause any problems?
Yeah, that is the big question.
Splitting into 26 subdirectories should give ~1000 files in the largest case. Pretty ugly though.
No modern filesystem should have any issue with that number, however, and neither should git.
On Thu, 23 Sep 2021, 5:19 pm sterni, @.***> wrote:
> > Would having 16000 files in a directory cause any problems?
>
> Yeah, that is the big question.
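The subdirectory split suggested above could be sketched roughly as follows. This is only an illustration: `shardPath` and the `haskellPackages/<letter>/` layout are hypothetical names, not anything hackage2nix currently produces, and the sketch assumes `<nixpkgs>` is on `NIX_PATH` so that `lib.toLower` is available.

```nix
# shard-sketch.nix (hypothetical file name)
# Evaluate with: nix-instantiate --eval --strict shard-sketch.nix
let
  # Assumption: <nixpkgs> resolves to a nixpkgs checkout.
  lib = import <nixpkgs/lib>;

  # Hypothetical helper: pick the subdirectory from the lower-cased first
  # character of the package name, so "Diff" and "deepseq" share "d/".
  shardPath = name:
    "haskellPackages/${lib.toLower (builtins.substring 0 1 name)}/${name}.nix";
in
# Map a few package names from the aeson example to their sharded paths,
# e.g. "Diff" becomes "haskellPackages/d/Diff.nix".
map shardPath [ "aeson" "Diff" "QuickCheck" "lens" ]
```

Sharding on the first character alone is the simplest scheme. Any package name that does not start with a letter would end up in an additional directory beyond the 26, so the real directory count would depend on what names exist on Hackage.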
> Nix's laziness would mean that upon evaluation of a single attribute, we have to parse less Nix code. Parsing `hackage-packages.nix` is quite expensive and it has a constant impact of over half a second on evaluating anything Haskell related.
Have you actually tested whether this is true? I believe that Nix parses included files even if they are not actually needed for the evaluation. I may be wrong (or the behavior might have changed), but I guess it's a good idea to test it.
> Have you actually tested whether this is true? I believe that Nix parses included files even if they are not actually needed for the evaluation. I may be wrong (or the behavior might have changed), but I guess it's a good idea to test it.
Oh, that is a good hint, I'll have to check that. Testing in general would be needed for this; for example, I'm not sure whether git performance may degrade if it has to update many extra individual files (in the tens of thousands) instead of a single big file…
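A small experiment along these lines might help settle the parsing question. It is a sketch, not a benchmark: the file names are hypothetical, and it assumes a sibling file `broken.nix` that contains a deliberate syntax error (for example, created with `echo '{' > broken.nix`).

```nix
# lazy-import-test.nix (hypothetical file name)
#
# Assumption: ./broken.nix exists next to this file and does not parse.
#
# Suggested commands:
#   nix-instantiate --eval lazy-import-test.nix -A good
#   nix-instantiate --eval lazy-import-test.nix -A bad
{
  # Does not depend on the broken file in any way.
  good = 1;

  # The import is only an unevaluated attribute value until `bad` is forced.
  bad = import ./broken.nix;
}
```

If the first command succeeds while the second reports a syntax error in `broken.nix`, that would suggest imported files are parsed only when the importing attribute is forced. It would not say anything about how much time is saved in practice, so timing the evaluation of a real split would still be needed.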