
SSH connection with linux-builder fails

Open workingdoge opened this issue 11 months ago • 10 comments

I'm able to ssh into linux-builder when I manually pass the ssh key and config, so the instance is running.

Failing with error: connecting to 'ssh-ng://builder@linux-builder'... cannot build on 'ssh-ng://builder@linux-builder': error: failed to start SSH connection to 'builder@linux-builder'

  nix = {
    settings = {
      trusted-users = [
        currentSystemUser
        "@admin"
      ];
      extra-trusted-users = [
        "@admin"
        currentSystemUser
      ];
      experimental-features = ["nix-command" "flakes" "repl-flake"];
      keep-outputs = true;
      keep-derivations = true;
    };
    extraOptions = ''
      extra-platforms = x86_64-darwin aarch64-darwin
    '';
    linux-builder = {
      enable = true;
      ephemeral = true;
      maxJobs = 4;
      config = {
        virtualisation = {
          darwin-builder = {
            diskSize = 40 * 1024;
            memorySize = 8 * 1024;
          };
          cores = 6;
        };
      };
    };
  };

I'm not quite sure how to inspect the SSH session that nix build opens.

workingdoge avatar Mar 23 '24 00:03 workingdoge

I've been working on something similar.

One thing I did find, which may help you, is that if you are going to override config, it might be necessary to also enable SSH. In other words, add services.openssh.enable = true to config = { .. }.
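
As a rough sketch of that suggestion, assuming the rest of the nix.linux-builder block stays as in the original post, the override might look like the fragment below. Whether the explicit SSH setting is actually required may depend on the nix-darwin version, so treat it as something to try rather than a confirmed fix.

```nix
{
  nix.linux-builder = {
    enable = true;
    config = {
      # Suggested in this thread: explicitly enable sshd inside the
      # builder VM when overriding `config`.
      services.openssh.enable = true;

      # Example override carried over from the original post.
      virtualisation.cores = 6;
    };
  };
}
```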

Secondly, it doesn't seem like logging is enabled by default. I had to add the following to my top-level config:

  launchd.daemons.linux-builder = {
    serviceConfig = {
      StandardOutPath = "/var/log/darwin-builder.log";
      StandardErrorPath = "/var/log/darwin-builder.log";
    };
  };

After which, I could view the logs on my host machine at /var/log/darwin-builder.log.

Hope some of this helps.

daveterra avatar Apr 28 '24 16:04 daveterra

I'm able to build successfully with the linux-builder, but I'm unable to ssh in myself. What password is it asking for?

georgealexanderday avatar May 01 '24 19:05 georgealexanderday

@georgealexanderday it would help to see the full output as well as your invocation. Even better if you add some debugging to the invocation with -vvv. For example, if you haven't set up a user with your ssh key, you need to specify a user, such as ssh builder@linux-builder.

For what it's worth, I'm able to SSH directly to the builder VM from a fresh setup.

LoganBarnett avatar Jul 04 '24 04:07 LoganBarnett

If you are looking to ssh into the image, I have managed to do that by using the private key generated when the machine image is created; e.g. edit .ssh/config to contain an alias:

Host linux-builder
  User builder
  Hostname localhost
  HostKeyAlias linux-builder
  IdentityFile /etc/nix/builder_ed25519
  Port 31022

In this case the private key has been placed in /etc/nix/builder_ed25519 by nix-darwin. However, I am having the same issue as workingdoge: I can ssh into the image, but I cannot seem to get ssh-ng to work when using the machine image as a builder.

Montmorency avatar Jul 08 '24 14:07 Montmorency

Actually, after further investigation I managed to get it to work. I had changed the permissions on /etc/nix/builder_ed25519 to 644 in order to run the build as a normal user (it could not access /etc/nix/ as a user group); however, ssh complained that 644 is too open for a private key. Setting the private key back to 600 and then running the build with sudo, e.g. sudo nix build --impure --option sandbox false .#packages.x86_64-linux.unoptimized-prod-server, then picks up the builder properly.

Montmorency avatar Jul 08 '24 15:07 Montmorency

I can confirm similar behavior:

  • sudo chmod 644 /etc/nix/builder_ed25519 makes ssh builder@linux-builder work
  • nix build .#xyz only works with 600

Zaunei avatar Jul 19 '24 08:07 Zaunei

In case this helps anyone else: To fix the errors "cannot build on {...}: error: failed to start SSH connection to {...}: Permission denied, please try again." and "error: failed to start SSH connection to {...}" (when trying to build using a remote NixOS build machine), two things turned out to be important in my case:

  • Make sure root on the local machine has passwordless SSH access to the user account on the build machine. (This can be achieved on the build machine as usual, either by appending to ~/.ssh/authorized_keys or doing it declaratively in configuration.nix with users.users.joe.openssh.authorizedKeys.keys = [ "{public key}" ] .)
  • Make sure the username is included everywhere in Nix configuration on the local machine. In my case, I had to put ssh://joe@{build-machine} instead of just ssh://{build-machine} in trusted-substituters, substituters, and builders — even when the user on my local machine was also named "joe".
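
A minimal sketch of the two points above, written as Nix configuration. The attribute names buildMachine and localMachine are hypothetical and only serve to show both fragments in one expression; the {public key} and {build-machine} placeholders are kept from the comment and need real values. Using nix.buildMachines with sshUser in place of a literal builders line, and the x86_64-linux system, are assumptions rather than something stated in this thread.

```nix
{
  # Fragment for the build machine's configuration.nix: authorize the
  # key that root on the local machine uses for passwordless login.
  buildMachine = {
    users.users.joe.openssh.authorizedKeys.keys = [ "{public key}" ];
  };

  # Fragment for the local machine: name the user explicitly in every
  # SSH store URI, even if the local user has the same name.
  localMachine = {
    nix.settings.substituters = [ "ssh://joe@{build-machine}" ];
    nix.settings.trusted-substituters = [ "ssh://joe@{build-machine}" ];

    # Assumption: declaring the builder this way instead of writing
    # `builders = ssh://joe@{build-machine} ...` by hand.
    nix.buildMachines = [
      {
        hostName = "{build-machine}";
        sshUser = "joe";
        protocol = "ssh";
        system = "x86_64-linux"; # assumed target platform
      }
    ];
  };
}
```

Because the Nix daemon runs as root, the key in authorizedKeys should be the one root offers, not the one from the invoking user's ~/.ssh.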

shlok avatar Aug 20 '24 15:08 shlok

For me, setting nix.linux-builder.ephemeral = true and then applying my config resolved this. No idea why. Possibly there was some kind of bad state that the rm -f /var/lib/darwin-builder/nixos.qcow2 helped with.

clo4 avatar Aug 30 '24 22:08 clo4

+1 to @clo4's suggestion; my login issue was also solved by wiping the state once.

But before the wipe, I also set up @daveterra's logfile idea and found this in the log of the VM:

User not known to the underlying authentication module

ofalvai avatar Sep 01 '24 10:09 ofalvai

@ofalvai I had the same in my logs too 🫣 It also looked like it was trying to start multiple times, but I don't know enough about the internals of this to know why that could be.

clo4 avatar Sep 01 '24 10:09 clo4