Check target architecture before deploying
Continuation of #64 because I just ran into this again. Error message:
Executing 'switch' on matched hosts:
** raspberry
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 3: use: command not found
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 4: use: command not found
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 5: use: command not found
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 6: use: command not found
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 7: use: command not found
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 8: syntax error near unexpected token `('
/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git/bin/switch-to-configuration: line 8: `use Sys::Syslog qw(:standard :macros);'
Error while activating new configuration.
The script in question:
#! /nix/store/ylsbl86156l99zl8vwcbjzlsdzfgw3l2-perl-5.32.1-env/bin/perl
use strict;
use warnings;
use File::Basename;
use File::Slurp;
use Net::DBus;
use Sys::Syslog qw(:standard :macros);
use Cwd 'abs_path';
my $out = "/nix/store/z6jmxkblhhvrl60ajg61lj489bvg2fmf-nixos-system-amethyst-21.05pre-git";
# FIXME: maybe we should use /proc/1/exe to get the current systemd.
my $curSystemd = abs_path("/run/current-system/sw/bin");
…
The machine is a Raspberry Pi 4 (Linux 5.10.61 #1-NixOS SMP aarch64 GNU/Linux).
So the root cause remains: the perl in the activation script's shebang is compiled for x86_64 Linux, although the hardware is aarch64. Executing the activation script manually from zsh gives me zsh: exec format error instead, which makes a lot more sense. However, when I execute it from Bash I get the stupid error message again. WTF?
So I found the root cause of the issue: morph does not check whether the architecture the configuration was built for matches the architecture of the target host. On a mismatch, the configuration contains a shebang pointing to a binary the host cannot execute, and activation fails. A preliminary check before deploying would catch this gracefully.
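To make the idea concrete, here is a rough sketch of what such a preliminary check could look like. This is not morph's code: `expected_machine` and `check_arch` are hypothetical helper names, and the mapping assumes that the part of the Nix system string before the first dash (`aarch64` in `aarch64-linux`) is what `uname -m` prints on the target.

```shell
# Hypothetical helper: derive the `uname -m` value we expect on the target
# from a Nix system string such as "aarch64-linux" or "x86_64-linux".
# Assumption: the prefix before the first dash matches `uname -m`.
expected_machine() {
    printf '%s\n' "${1%%-*}"
}

# Hypothetical check: return 0 only if the system the configuration was
# built for agrees with the machine type reported by the target host.
check_arch() {
    built_system=$1
    target_machine=$2
    if [ "$(expected_machine "$built_system")" != "$target_machine" ]; then
        echo "refusing to deploy: built for $built_system, target is $target_machine" >&2
        return 1
    fi
    return 0
}
```

Usage could look like `check_arch "$(cat "$closure/system")" "$(ssh "$host" uname -m)"`, where `$closure` and `$host` are placeholders, and reading the system string from a `system` file inside the built closure is an assumption about the NixOS system layout rather than something morph is known to do.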
As for the error message, this is a bug in Bash (actually, they say it's not because it's specified in POSIX, so one could call it a bug in POSIX): if executing a file fails with an exec format error (ENOEXEC), Bash falls back to interpreting the file itself as a shell script. Of course this makes no sense here, and since our file is a Perl script, Bash chokes on the first Perl statements and we get the use: command not found error message.
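The fallback can be reproduced without a second architecture. The sketch below is only an approximation of my situation: instead of a shebang pointing to a foreign binary, it uses a file with no interpreter line at all, which should hit the same ENOEXEC path. `demo_enoexec_fallback` is a made-up name, and the demo assumes `bash` is installed and the temporary directory is not mounted noexec.

```shell
# Reproduce the fallback: an executable file that looks like Perl but that
# the kernel cannot exec directly (here: because it has no shebang).
demo_enoexec_fallback() {
    script=$(mktemp) || return 1
    printf 'use strict;\n' > "$script"
    chmod +x "$script"
    # execve() fails with ENOEXEC, so bash reads the file as shell commands
    # and tries to run "use" as a command.
    LC_ALL=C bash -c '"$1"' demo "$script" 2>&1
    rm -f "$script"
}
```

Calling `demo_enoexec_fallback` should print a line ending in `use: command not found`, prefixed with the temporary file's path, which is the same shape as the messages in the log above.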
Do you have any good tips for recovering a host that has been morphed with the wrong architecture?
switch-to-configuration is now borked on an x86 host after I mistakenly pushed an aarch64 config, so the zstd on it is not executable.
No, I was fortunate enough that the activation script was in Perl and thus failed before it could do any harm. You may try executing the activation script in emulation mode (with binfmt), but I can't guarantee this will work or is even a good idea. Personally, I'd try to manually force a new configuration onto the host and then run morph deploy over it.
I copied over the activation script from a previous generation and that made it happy to deploy again.