haskell.nix
Closures of haskell.nix exes are bloated by (unnecessary?) system dependencies, namely GCC.
The following tests were performed on x86_64.
Consider this plain-nixpkgs-test.nix:
let
  pkgsrc = builtins.fetchGit {
    url = "https://github.com/nixos/nixpkgs";
    ref = "nixos-21.11";
    rev = "2d474d6a4a43a0348b78db68dc00c491032cf5cf";
  };
  pkgs = import pkgsrc {};
in pkgs.haskell.packages.ghc8107.hello
If I build hello this way with vanilla nixpkgs Haskell machinery, I get essentially the direct runtime deps I'd expect:
$ nix-store --query --references $(nix-build ./plain-nixpkgs-test.nix)
/nix/store/qjgj2642srlbr59wwdihnn66sw97ming-glibc-2.33-123
/nix/store/0pwwaj3scaav84hg7l9dc9mf4l0ikwfp-libffi-3.4.2
/nix/store/mi1pdm9qnvbbpcbf7vkzsbfrfi0xqgja-gmp-6.2.1
And the runtime closure size is reasonably compact:
$ du -hc $(nix-store --query -R $(nix-build ./plain-nixpkgs-test.nix))
...
42M total
If I build the same package with the same GHC version against the same nixpkgs with Haskell.nix (haskell-nix-test.nix):
let
  pkgsrc = builtins.fetchGit {
    url = "https://github.com/nixos/nixpkgs";
    ref = "nixos-21.11";
    rev = "2d474d6a4a43a0348b78db68dc00c491032cf5cf";
  };
  haskellNixSrc = builtins.fetchGit {
    url = "https://github.com/input-output-hk/haskell.nix";
    ref = "master";
    rev = "82bc94581865b098985890745c371ef6ce67f1ce";
  };
  pkgs = import pkgsrc ((import haskellNixSrc {}).nixpkgsArgs);
in pkgs.haskell-nix.hackage-package {
  compiler-nix-name = "ghc8107";
  name = "hello";
  version = "1.0.0.2";
}
I end up with extra dependencies on gcc-lib, glibc-dev, libffi-dev, and gcc itself (my understanding is that the extra dependency on numactl is intentional for us):
$ nix-store --query --references $(nix-build ./haskell-nix-test.nix -A components.exes.hello)
/nix/store/qjgj2642srlbr59wwdihnn66sw97ming-glibc-2.33-123
/nix/store/0pwwaj3scaav84hg7l9dc9mf4l0ikwfp-libffi-3.4.2
/nix/store/jvlq57vcmkdng66f0wr4npxsnrp72ysf-gcc-10.3.0-lib
/nix/store/rcjksx6hx0i1sm8q1jhz5a415sb5xcwf-glibc-2.33-123-dev
/nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
/nix/store/2ysrmiixbxj8fr9zxjlyb214n1dix2kh-libffi-3.4.2-dev
/nix/store/89b3wsmak50i85ikccycj7m9ihi9zgk7-numactl-2.0.14
/nix/store/mi1pdm9qnvbbpcbf7vkzsbfrfi0xqgja-gmp-6.2.1
/nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2
And the runtime closure size is much, much larger (mostly due to the dependency on gcc):
$ du -hc $(nix-store --query -R $(nix-build ./haskell-nix-test.nix -A components.exes.hello))
...
236M total
It turns out the reason we depend on gcc is the inclusion of the include path /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0/lib/gcc/x86_64-unknown-linux-gnu/10.3.0/include as a string somewhere in the exe:
$ nix why-depends /nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2 /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
/nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2
╚═══bin/hello: …ncludes/stg.includes./nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0/lib/gcc/x86_64-unkno…
=> /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
It seems strange to me that an include path (and indeed, that directory in gcc seems to contain only header files) is baked as a string into an executable like this; the same gcc store path isn't included in the ELF headers for hello at all. All of my company's exes built with Haskell.nix have this problem. A cursory search through our builders didn't reveal the source of this to me; does anyone know why we have this extra (and I believe spurious) dependency?
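For context on why a bare string is enough to create the dependency: Nix finds runtime references by scanning an output's bytes for store path hashes, so the ELF headers never need to mention gcc. A minimal sketch of that scan, using a throwaway stand-in file rather than the real bin/hello (the file contents and variable names here are made up for illustration):

```shell
# Hash part of the gcc store path from the why-depends output above.
hash=13b8kc6nxq316yfrhrprzkcw9m84zvbw

# Stand-in for bin/hello: a few bytes with the include path embedded
# between NUL terminators, roughly as a string constant would be.
fake_exe=$(mktemp)
printf 'includes/stg.includes\000/nix/store/%s-gcc-10.3.0/lib/gcc\000\n' "$hash" > "$fake_exe"

# -a makes grep treat the binary data as text; -c counts matching lines.
hits=$(grep -a -c "$hash" "$fake_exe" || true)
echo "lines mentioning the gcc hash: $hits"
```

Running the same grep against the real $out/bin/hello should be a quick way to confirm whether a given build still carries the path.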
It's the debug symbols. I think there are two alternative fixes:
Strip the symbols
modules = [{ dontStrip = false; }];
Remove the references to gcc, so that the symbols will no longer work for the C header files. Instead they will point to a nonexistent /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-10.3.0 path.
modules = [
  ({pkgs, ...}: {
    packages.hello.components.exes.hello.postInstall = ''
      remove-references-to -t ${pkgs.buildPackages.gcc.cc} $out/bin/*
    '';
  })
];
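To make the second option less magical: as far as I understand it, remove-references-to rewrites the hash part of the target store path in place with a same-length run of `e` characters, so the file size and all offsets in the binary stay the same. A rough sketch of that substitution on a stand-in file (plain sed here, not the real tool; the file contents and variable names are made up for illustration):

```shell
old_hash=13b8kc6nxq316yfrhrprzkcw9m84zvbw
# 32 'e' characters, the same length as a store path hash.
new_hash=$(printf '%032d' 0 | tr 0 e)

# Stand-in for $out/bin/hello with the gcc include path embedded.
fake_exe=$(mktemp)
printf 'includes/stg.includes./nix/store/%s-gcc-10.3.0/lib/gcc/include.\n' "$old_hash" > "$fake_exe"
size_before=$(wc -c < "$fake_exe" | tr -d ' ')

# Swap the hash for the placeholder; the rest of the path is left alone.
patched=$(mktemp)
sed "s|/nix/store/${old_hash}-|/nix/store/${new_hash}-|g" "$fake_exe" > "$patched"
size_after=$(wc -c < "$patched" | tr -d ' ')

# Count lines that still mention the old hash (grep exits 1 on zero matches).
remaining=$(grep -c "$old_hash" "$patched" || true)
echo "size: $size_before -> $size_after, remaining references: $remaining"
```

Because the scanner no longer finds the gcc hash in the output, gcc drops out of the runtime closure, at the cost of debug info pointing at a path that does not exist.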
Dupe: https://github.com/input-output-hk/haskell.nix/issues/829
Why are we not stripping by default? (Previously asked, but not answered: https://github.com/input-output-hk/haskell.nix/issues/829#issuecomment-684830649)
@angerman Do you have an example of a program that the stripper breaks?
Stripping will mostly break LLVM-built macOS builds. We try to remove .deadstrip_via_symbols from the assembly these days.
The issue (at least on darwin) is as follows, and is due to Tables Next To Code.
The layout in the object file is as follows:
[prefix data]
fsym:
[function body]
We thus rely on the stripper not to drop the prefix data, which the linker will just throw away if we strip via symbols (as we do on darwin). This can in principle be fixed by emitting an extra dummy symbol prior to the prefix data and referencing that symbol from fsym; however, making this work across all LLVM backends is fairly infeasible.
This should be hopefully fixed from 8.10.7 onwards, though in that case stripping just won't do much (on darwin).
So the answer is that we break on a) MacOS b) when using LLVM c) on old versions of GHC. Sounds like we could turn it on only in those cases then.
Closed by https://github.com/input-output-hk/haskell.nix/pull/1476
Ignore me, this issue is about stripping, not data outputs.
I still think stripping more by default could be a good idea.