Bug: The build fails if a build machine/cache is offline

Open NorfairKing opened this issue 5 years ago • 16 comments

Describe the bug

I set up my desktop computer as a build machine and binary cache for my laptop. When I turn off my desktop, every build on my laptop fails.

I can't tell if it is because the desktop is a build machine, or because it's a binary cache, but in both cases this should not be happening.

Steps To Reproduce

  1. Set up machine B as a build machine and binary cache for machine A
  2. Turn off machine B
  3. Run nix-build on machine A

Expected behavior

A builds everything itself.

nix-env --version output

$ nix-env --version
nix-env (Nix) 2.3.6

NorfairKing avatar Apr 19 '20 13:04 NorfairKing

@edolstra How would I go about pushing this forward?

NorfairKing avatar Aug 13 '20 04:08 NorfairKing

A good first step is to follow the issue template: https://github.com/NixOS/nix/issues/new?assignees=&labels=bug&template=bug_report.md&title=

zimbatm avatar Aug 13 '20 08:08 zimbatm

@zimbatm Is this better?

NorfairKing avatar Aug 13 '20 17:08 NorfairKing

I marked this as stale due to inactivity. → More info

stale[bot] avatar Feb 12 '21 05:02 stale[bot]

Still relevant.

NorfairKing avatar Feb 12 '21 07:02 NorfairKing

@rickynils might be interested in pursuing this since he is working on nixbuild.net

zimbatm avatar Feb 12 '21 10:02 zimbatm

Still relevant. In NixOS, nix.binaryCaches is a list, so hard-failing when the first item is offline is a bug and completely counter-intuitive. Also, the error message should suggest using --option substituters (or any other currently accepted workaround).

ZoomRmc avatar Mar 17 '21 10:03 ZoomRmc

@ZoomRmc can you expand on how to use --option substituters? I don't see anything about it in nixos-rebuild --help.

Edit: I'll just add it myself, since I found it elsewhere: --option substitute false

ursi avatar Sep 09 '21 22:09 ursi

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/ignore-offline-substituters/15450/4

nixos-discourse avatar Oct 12 '21 22:10 nixos-discourse

I can't tell if it is because the desktop is a build machine, or because it's a binary cache, but in both cases this should not be happening.

For me, offline remote builders cause no issue, just a fast failure message:

cannot build on 'ssh://[email protected]': error: cannot connect to '[email protected]': ssh: Could not resolve hostname superfastmachine.local: Name or service not known

And then it continues to build on the local machine. I previously used IPs instead of hostnames, and then it hung a lot longer before it continued to build on the local machine.

But when using binary cache like this:

substituters = http://superfastmachine.local:5000/ https://cache.nixos.org/

in nix.conf, and my pc wants to download something from there, I get:

warning: error: unable to download 'http://superfastmachine.local:5000/wbjfdccsii8wcnawlgg1a72i2vazfg4b.narinfo': Couldn't resolve host name (6); retrying in 336 ms
disabling binary cache 'http://superfastmachine.local:5000' for 60 seconds
error: unable to download 'http://superfastmachine.local:5000/0wxn3wnk6qiv5kzl0w8abv9jzh8szgqz.narinfo': Couldn't resolve host name (6)
error: unexpected end-of-file

I then need to pass --option substituters https://cache.nixos.org to exclude the unavailable binary cache before the build can finish. I thought fallback = true in nix.conf would help, but it did not.
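The manual override above could be scripted. The following is only a sketch, not something Nix provides: the function names reachable_substituters and probe_http are hypothetical, and the probe assumes an HTTP(S) cache that serves a nix-cache-info file at its root. It prints the substituters that answer within a short timeout, so the result can be handed to --option substituters.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nix): keep only substituters that respond.

# Default probe: try to fetch the cache's nix-cache-info file with short
# timeouts. Assumes an http(s) substituter; other store types need another probe.
probe_http() {
    curl --silent --fail --output /dev/null \
        --connect-timeout 2 --max-time 5 \
        "${1%/}/nix-cache-info"
}

# reachable_substituters PROBE URL...
# Prints, space-separated and in the original order, every URL for which
# PROBE exits successfully. Prints an empty line if none respond.
reachable_substituters() {
    probe=$1
    shift
    result=""
    for url in "$@"; do
        if "$probe" "$url"; then
            result="${result:+$result }$url"
        fi
    done
    printf '%s\n' "$result"
}

# Example usage (not executed here):
#   subs=$(reachable_substituters probe_http \
#       http://superfastmachine.local:5000/ https://cache.nixos.org/)
#   nix-build --option substituters "$subs"
```

Passing the probe as an argument keeps the filtering logic separate from the network check. Overriding substituters on the command line may also require the invoking user to be trusted by the Nix daemon.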

afreakk avatar Dec 02 '21 21:12 afreakk

I marked this as stale due to inactivity. → More info

stale[bot] avatar Jul 11 '22 00:07 stale[bot]

unstale bot

NorfairKing avatar Jul 11 '22 12:07 NorfairKing

Still important.

tbidne avatar Dec 01 '22 21:12 tbidne

Related: https://github.com/NixOS/nix/issues/3796, https://github.com/NixOS/nix/issues/6901

tbidne avatar Dec 01 '22 23:12 tbidne

I'm pretty sure fallback = true is supposed to fix this; I use this exact setup locally. I'm not sure why that didn't work for @afreakk. Maybe it was a bug that's been fixed now? You'll also need to set connect-timeout = 5 or something else low, otherwise the build will hang for minutes. I talked about this in more detail here.

Also related is #7188, which should fix this without needing to set fallback = true.
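
The two settings named above could be written in nix.conf roughly as follows. This is a sketch: the value 5 is just the one suggested in this comment, and whether fallback covers an unreachable substituter may depend on the Nix version in use.

```conf
# nix.conf (sketch)

# Fall back to building from source if a binary substitute fails.
fallback = true

# Seconds to wait when connecting to a binary cache before giving up.
connect-timeout = 5
```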

arcuru avatar Dec 04 '22 17:12 arcuru

Setting fallback = true does indeed allow me to build; however, it triggers a stream of error: opening a connection to remote store 'ssh-ng://missing-server' previously failed messages. It'd be nice to have a way to mark the server as truly optional so that these messages can be avoided.

I set fallback = true like this:


  # Set up the SSH keys for the machines we want to build against.
  programs.ssh = {
    extraConfig = ''
      Host missing-server

        # <snip>

        # Use an aggressive timeout because we're not always on
        # the LAN
        ConnectTimeout 3
    '';
  };

  nix = {
    settings = {

      # <snip>

      # Private binary cache
      substituters = [ "ssh-ng://missing-server" ];
    };

    extraOptions = ''
      # Ensure we can still build when missing-server is not accessible
      fallback = true
    '';
  };

johnhamelink avatar Aug 27 '24 17:08 johnhamelink

If this isn't a footnuke I don't know what is

philipwilk avatar May 09 '25 02:05 philipwilk

A similarly annoying behavior: when you have a private cache that requires an authentication token and that token has expired, builds fail.

Why can't nix just skip the substituter it can't access?

m4dc4p avatar May 23 '25 16:05 m4dc4p

Not sure what the current status is of the two other PRs relating to this, https://github.com/NixOS/nix/pull/7188 and https://github.com/NixOS/nix/pull/8983, because both of them seem quiet.

philipwilk avatar May 30 '25 22:05 philipwilk

This issue has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/what-is-the-best-practice-to-use-binary-cache-for-this-situation/66212/2

nixos-discourse avatar Jul 21 '25 10:07 nixos-discourse

Hopefully fixed by #13301

Ericson2314 avatar Sep 26 '25 15:09 Ericson2314