haskell.nix

Closures of haskell.nix exes are bloated by (unnecessary?) system dependencies, namely GCC.

Open TravisWhitaker opened this issue 2 years ago • 7 comments

The following tests were performed on x86_64.

Consider this plain-nixpkgs-test.nix:

let pkgsrc = builtins.fetchGit
    {
        url = "https://github.com/nixos/nixpkgs";
        ref = "nixos-21.11";
        rev = "2d474d6a4a43a0348b78db68dc00c491032cf5cf";
    };
    pkgs = import pkgsrc {};
in pkgs.haskell.packages.ghc8107.hello

If I build hello this way with vanilla nixpkgs Haskell machinery, I get essentially the direct runtime deps I'd expect:

$ nix-store --query --references $(nix-build ./plain-nixpkgs-test.nix)
/nix/store/qjgj2642srlbr59wwdihnn66sw97ming-glibc-2.33-123
/nix/store/0pwwaj3scaav84hg7l9dc9mf4l0ikwfp-libffi-3.4.2
/nix/store/mi1pdm9qnvbbpcbf7vkzsbfrfi0xqgja-gmp-6.2.1

And the runtime closure size is reasonably compact:

$ du -hc $(nix-store --query -R $(nix-build ./plain-nixpkgs-test.nix))
...
42M     total

If I build the same package with the same GHC version against the same nixpkgs with Haskell.nix (haskell-nix-test.nix):

let pkgsrc = builtins.fetchGit
    {
        url = "https://github.com/nixos/nixpkgs";
        ref = "nixos-21.11";
        rev = "2d474d6a4a43a0348b78db68dc00c491032cf5cf";
    };
    haskellNixSrc = builtins.fetchGit
    {
        url = "https://github.com/input-output-hk/haskell.nix";
        ref = "master";
        rev = "82bc94581865b098985890745c371ef6ce67f1ce";
    };
    pkgs = import pkgsrc ((import haskellNixSrc {}).nixpkgsArgs);
in pkgs.haskell-nix.hackage-package
{
    compiler-nix-name = "ghc8107";
    name = "hello";
    version = "1.0.0.2";
}

I end up with extra dependencies on gcc-lib, glibc-dev, libffi-dev, and gcc itself (my understanding is that the extra dependency on numactl is intentional):

$ nix-store --query --references $(nix-build ./haskell-nix-test.nix -A components.exes.hello)
/nix/store/qjgj2642srlbr59wwdihnn66sw97ming-glibc-2.33-123
/nix/store/0pwwaj3scaav84hg7l9dc9mf4l0ikwfp-libffi-3.4.2
/nix/store/jvlq57vcmkdng66f0wr4npxsnrp72ysf-gcc-10.3.0-lib
/nix/store/rcjksx6hx0i1sm8q1jhz5a415sb5xcwf-glibc-2.33-123-dev
/nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
/nix/store/2ysrmiixbxj8fr9zxjlyb214n1dix2kh-libffi-3.4.2-dev
/nix/store/89b3wsmak50i85ikccycj7m9ihi9zgk7-numactl-2.0.14
/nix/store/mi1pdm9qnvbbpcbf7vkzsbfrfi0xqgja-gmp-6.2.1
/nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2

And the runtime closure size is much, much larger (mostly due to the dependency on gcc):

$ du -hc $(nix-store --query -R $(nix-build ./haskell-nix-test.nix -A components.exes.hello))
...
236M    total

It turns out we depend on gcc because the include path /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0/lib/gcc/x86_64-unknown-linux-gnu/10.3.0/include is embedded as a string somewhere in the exe:

$ nix why-depends /nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2 /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
/nix/store/sf52saiflj3wq6a838wm63vnhpi6pviy-hello-exe-hello-1.0.0.2
╚═══bin/hello: …ncludes/stg.includes./nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0/lib/gcc/x86_64-unkno…
    => /nix/store/13b8kc6nxq316yfrhrprzkcw9m84zvbw-gcc-10.3.0
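
Nix records a runtime reference whenever the hash part of a store path appears anywhere in an output, which is why a plain string is enough to pull gcc into the closure. A rough way to see every store path embedded in a binary by hand is sketched below. `list_store_refs` is a hypothetical helper name, not a Nix command, and the pattern assumes the usual 32-character hash followed by a name.

```shell
#!/bin/sh
# Sketch: list the distinct /nix/store paths embedded as plain strings in a file.
# list_store_refs is a hypothetical helper, not part of Nix.
list_store_refs() {
    # -a: treat binary input as text; -o: print only the matching part.
    # A store path is "/nix/store/", a 32-character hash, "-", then a name.
    LC_ALL=C grep -a -o -E '/nix/store/[0-9a-z]{32}-[-+._?=0-9A-Za-z]+' "$1" | sort -u
}
```

For example, `list_store_refs result/bin/hello` would be expected to include the gcc-10.3.0 path reported by `nix why-depends` above, assuming `result` is the nix-build symlink for the exe.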

It seems strange to me that an include path (and indeed, that directory in gcc seems to contain only header files) is baked as a string into an executable like this; the same gcc store path isn't included in the ELF headers for hello at all. All of my company's exes built with Haskell.nix have this problem. A cursory search through our builders didn't reveal the source of this to me; does anyone know why we have this extra (and, I believe, spurious) dependency?

TravisWhitaker avatar May 19 '22 22:05 TravisWhitaker

It's the debug symbols. I think there are two alternative fixes:

1. Strip the symbols:

modules = [{ dontStrip = false; }];

2. Remove the references to gcc, so that the debug symbols no longer resolve to the C header files. Instead, they will point to a nonexistent /nix/store/eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee-gcc-10.3.0 path:

modules = [
  ({pkgs, ...}: {
    packages.hello.components.exes.hello.postInstall = ''
      remove-references-to -t ${pkgs.buildPackages.gcc.cc} $out/bin/*
    '';
  })
];
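
The idea behind that second fix can be sketched in a few lines of shell: overwrite the hash part of the store path with a same-length placeholder, so Nix's reference scan no longer finds it while every offset in the binary stays the same. This is only an illustration under assumptions. `scrub_store_ref` is a hypothetical helper, not the nixpkgs `remove-references-to` script, and it assumes a `sed` that copes with binary data (GNU sed does).

```shell
#!/bin/sh
# Sketch of the idea behind remove-references-to; scrub_store_ref is a
# hypothetical helper, not the nixpkgs script itself.
scrub_store_ref() {
    target=$1   # store path to forget, e.g. /nix/store/<hash>-gcc-10.3.0
    file=$2     # file to rewrite in place
    # The hash is the first 32 characters of the store path's base name.
    hash=$(basename "$target" | cut -c1-32)
    # Build a 32-character placeholder made of the letter "e".
    blank=$(printf '%032d' 0 | tr 0 e)
    tmp=$(mktemp) || return 1
    # Same-length replacement: the file size and all offsets are unchanged.
    LC_ALL=C sed "s|$hash|$blank|g" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Writing the result back with `cat` rather than `mv` keeps the original file's permissions, so an executable stays executable.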

hamishmack avatar May 20 '22 04:05 hamishmack

Dupe: https://github.com/input-output-hk/haskell.nix/issues/829

Why are we not stripping by default? (Previously asked, but not answered: https://github.com/input-output-hk/haskell.nix/issues/829#issuecomment-684830649)

michaelpj avatar May 20 '22 09:05 michaelpj

@angerman Do you have an example of a program that the stripper breaks?

TravisWhitaker avatar May 23 '22 22:05 TravisWhitaker

Stripping will mostly break LLVM-built macOS builds. We try to remove .deadstrip_via_symbols from the assembly these days.

The issue (at least on darwin) is as follows, and is due to Tables Next To Code.

The layout in the object file is as follows:

        [prefix data]
fsym: 
        [function body]

We thus rely on the stripper not to drop the prefix data. If we strip via symbols (as we do on darwin), the linker will just throw the prefix data away, because no symbol refers to it. In principle this can be fixed by emitting an extra dummy symbol before the prefix data and referencing that symbol from fsym; however, making this work across all LLVM backends is fairly infeasible.

This should hopefully be fixed from GHC 8.10.7 onwards, though in that case stripping just won't do much (on darwin).

angerman avatar May 24 '22 02:05 angerman

So the answer is that stripping breaks a) on macOS, b) when using LLVM, and c) on old versions of GHC. Sounds like we could skip stripping only in those cases, then.

michaelpj avatar May 24 '22 09:05 michaelpj

Closed by https://github.com/input-output-hk/haskell.nix/pull/1476

michaelpj avatar May 24 '22 12:05 michaelpj

Ignore me, this issue is about stripping, not data outputs.

michaelpj avatar May 24 '22 12:05 michaelpj

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 28 '22 14:09 stale[bot]

I still think stripping more by default could be a good idea.

michaelpj avatar Sep 28 '22 15:09 michaelpj