deploy-rs

Stuck successfully deploying an old derivation

Open liammcdermott opened this issue 1 year ago • 1 comment

Every time I run a deploy, the activation script run on the remote is from an old version of the Flake I'm deploying.

Here is a deployment log:

$ nix run github:serokell/deploy-rs -- --remote-build --skip-checks --debug-logs
🚀 ❓ [deploy] [DEBUG] Checking for flake support
🚀 â„šī¸ [deploy] [INFO] Evaluating flake in .
warning: Git tree '/home/liam/code/hosting' is dirty
evaluation warning: system.stateVersion is not set, defaulting to 25.05. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
evaluation warning: system.stateVersion is not set, defaulting to 25.05. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
evaluation warning: system.stateVersion is not set, defaulting to 25.05. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
evaluation warning: system.stateVersion is not set, defaulting to 25.05. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
evaluation warning: system.stateVersion is not set, defaulting to 25.05. Read why this matters on https://nixos.org/manual/nixos/stable/options.html#opt-system.stateVersion.
🚀 â„šī¸ [deploy] [INFO] The following profiles are going to be deployed:
[backend01.system]
user = "root"
ssh_user = "deploy"
path = "/nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b"
hostname = "backend01.barhost.ca"
ssh_opts = []

[backend01.foo]
user = "foo"
ssh_user = "foo"
path = "/nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo"
hostname = "backend01.barhost.ca"
ssh_opts = []

🚀 ❓ [deploy] [DEBUG] Finding the deriver of store path for /nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b
🚀 â„šī¸ [deploy] [INFO] Building profile `system` for node `backend01` on remote host
🚀 ❓ [deploy] [DEBUG] build command: Command { std: NIX_SSHOPTS="" "nix" "build" "/nix/store/adc67haq45m38hrnvvj4id6pv7h38n6z-activatable-nixos-system-nixos-24.11.20241223.32a1c7b.drv^out" "--eval-store" "auto" "--store" "ssh-ng://[email protected]", kill_on_drop: false }
🚀 ❓ [deploy] [DEBUG] Finding the deriver of store path for /nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo
🚀 â„šī¸ [deploy] [INFO] Building profile `foo` for node `backend01` on remote host
🚀 ❓ [deploy] [DEBUG] build command: Command { std: NIX_SSHOPTS="" "nix" "build" "/nix/store/mp6x4nh2swyx24xvw00455h5x7599hfq-activatable-foo.drv^out" "--eval-store" "auto" "--store" "ssh-ng://[email protected]", kill_on_drop: false }
🚀 â„šī¸ [deploy] [INFO] Activating profile `system` for node `backend01`
🚀 ❓ [deploy] [DEBUG] Constructed activation command: sudo -u root /nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b/activate-rs --debug-logs activate '/nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b' --profile-user root --profile-name system --temp-path '/tmp' --confirm-timeout 30 --magic-rollback --auto-rollback
🚀 ❓ [deploy] [DEBUG] Constructed wait command: sudo -u root /nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b/activate-rs --debug-logs wait '/nix/store/ljimhwgkpksxrpvnlamy6kam1yl8x1dd-activatable-nixos-system-nixos-24.11.20241223.32a1c7b' --temp-path '/tmp'
🚀 â„šī¸ [deploy] [INFO] Creating activation waiter
⭐ â„šī¸ [activate] [INFO] Activating profile
⭐ ❓ [activate] [DEBUG] Running activation script
👀 â„šī¸ [wait] [INFO] Waiting for confirmation event...
activating the configuration...
setting up /etc...
sops-install-secrets: Imported /etc/ssh/ssh_host_rsa_key as GPG key with fingerprint 9645dda94c7b63b74bc7eff6ccc16b3119839339
sops-install-secrets: Imported /etc/ssh/ssh_host_ed25519_key as age key with fingerprint age1hd26hcld0w5ayg388tfwd9sm2438l2g64wszw82dnt9cpeqr45msdnvf4e
reloading user units for foo...
reloading user units for deploy...
reloading user units for baz...
restarting sysinit-reactivation.target
⭐ â„šī¸ [activate] [INFO] Activation succeeded!
⭐ â„šī¸ [activate] [INFO] Magic rollback is enabled, setting up confirmation hook...
⭐ ❓ [activate] [DEBUG] Ensuring parent directory exists for canary file
⭐ ❓ [activate] [DEBUG] Creating canary file
⭐ ❓ [activate] [DEBUG] Creating notify watcher
⭐ â„šī¸ [activate] [INFO] Waiting for confirmation event...
👀 â„šī¸ [wait] [INFO] Found canary file, done waiting!
🚀 ❓ [deploy] [DEBUG] Wait command ended
🚀 â„šī¸ [deploy] [INFO] Success activating, attempting to confirm activation
🚀 ❓ [deploy] [DEBUG] Attempting to run command to confirm deployment: sudo -u root rm /tmp/deploy-rs-canary-ljimhwgkpksxrpvnlamy6kam1yl8x1dd
🚀 â„šī¸ [deploy] [INFO] Deployment confirmed.
⭐ ❓ [activate] [DEBUG] Got worthy removal event, sending on channel
🚀 â„šī¸ [deploy] [INFO] Activating profile `foo` for node `backend01`
🚀 ❓ [deploy] [DEBUG] Constructed activation command: /nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo/activate-rs --debug-logs activate '/nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo' --profile-user foo --profile-name foo --temp-path '/tmp' --confirm-timeout 30 --magic-rollback --auto-rollback
🚀 ❓ [deploy] [DEBUG] Constructed wait command: /nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo/activate-rs --debug-logs wait '/nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo' --temp-path '/tmp'
🚀 â„šī¸ [deploy] [INFO] Creating activation waiter
⭐ â„šī¸ [activate] [INFO] Activating profile
👀 â„šī¸ [wait] [INFO] Waiting for confirmation event...
⭐ ❓ [activate] [DEBUG] Running activation script
⭐ â„šī¸ [activate] [INFO] Activation succeeded!
⭐ â„šī¸ [activate] [INFO] Magic rollback is enabled, setting up confirmation hook...
⭐ ❓ [activate] [DEBUG] Ensuring parent directory exists for canary file
⭐ ❓ [activate] [DEBUG] Creating canary file
⭐ ❓ [activate] [DEBUG] Creating notify watcher
⭐ â„šī¸ [activate] [INFO] Waiting for confirmation event...
👀 â„šī¸ [wait] [INFO] Found canary file, done waiting!
🚀 ❓ [deploy] [DEBUG] Wait command ended
🚀 â„šī¸ [deploy] [INFO] Success activating, attempting to confirm activation
🚀 ❓ [deploy] [DEBUG] Attempting to run command to confirm deployment: rm /tmp/deploy-rs-canary-8l0ch348lkpyrhk9ahap5mgsz13h8n9b
⭐ ❓ [activate] [DEBUG] Got worthy removal event, sending on channel
🚀 â„šī¸ [deploy] [INFO] Deployment confirmed.

If I look at the file /nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo from this line:

🚀 ❓ [deploy] [DEBUG] Constructed activation command: /nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo/activate-rs --debug-logs activate '/nix/store/8l0ch348lkpyrhk9ahap5mgsz13h8n9b-activatable-foo' --profile-user foo --profile-name foo --temp-path '/tmp' --confirm-timeout 30 --magic-rollback --auto-rollback

It's an old version of the activation script. Prior to running the deploy I added echo calls for debugging, and they're not in the deployed script, even after running a deploy multiple times. What's odd is that no error is reported: every deployment appears to succeed, but nothing actually changes on the remote.
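One way to narrow this down is to compare the profile store paths printed in the "profiles are going to be deployed" summary of two deploy logs, one taken before editing the activation script and one after. With Nix's default input-addressed store paths, a changed script should produce a different path, so an identical path in both logs would suggest the edit is not reaching evaluation at all. The sketch below is only an illustration: `profile_paths` and `unchanged_profiles` are hypothetical helper names, not part of deploy-rs, and the parsing assumes the summary layout shown in the log above.

```python
import re

# Matches a "[node.profile]" header line from the deploy-rs profile summary.
HEADER = re.compile(r"^\[(?P<name>[^\]\s]+\.[^\]\s]+)\]$")
# Matches the 'path = "/nix/store/<hash>-<name>"' line that follows a header.
PATH = re.compile(r'^path = "(?P<path>/nix/store/[a-z0-9]{32}-[^"]+)"$')


def profile_paths(log_text):
    """Return {"node.profile": store_path} parsed from one deploy log."""
    paths = {}
    current = None
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            continue
        found = PATH.match(line)
        if found and current is not None:
            paths[current] = found.group("path")
            current = None
    return paths


def unchanged_profiles(before_log, after_log):
    """Return the sorted profile names whose store path is identical in both logs."""
    before = profile_paths(before_log)
    after = profile_paths(after_log)
    return sorted(name for name in before if after.get(name) == before[name])
```

Saving the output of two runs to files and passing their contents to `unchanged_profiles` lists the profiles whose path did not move. If `backend01.foo` shows up there after the echo calls were added, the problem is likely upstream of the activation step (evaluation or build inputs) rather than in the activation itself.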

Abridged version of my flake.nix, in case it's helpful. Note: --skip-checks is there since deploy is run from an x86_64 machine, and the target is an ARM server.

liammcdermott Jan 03 '25 06:01

I wonder if this may be related to #216 somehow, or more generally to remote builds where the local host and the remote have different architectures.

liammcdermott Jan 03 '25 06:01