nixos-images
kexec fails due to IMA being enforced on Azure VMs
kexec fails due to IMA (Integrity Measurement Architecture) being enforced on Azure. I'm using nixos-anywhere and just saw that the image it uses for unattended installs comes from here.
See here : https://github.com/numtide/nixos-anywhere/issues/189
I want to know: do I need to build a new image in order to use kexec -s instead of kexec?
It is due to IMA appraisal being enabled on Azure VMs:
[ 3099.239362] ima: impossible to appraise a kernel image without a file descriptor; try using kexec_file_load syscall.
More details here : https://kernsec.org/pipermail/linux-security-module-archive/2018-October/008951.html
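One way to confirm that a VM hits this failure mode is to search the kernel log for the IMA message quoted above. This is only a minimal sketch: the function name `ima_blocks_kexec_load` is made up for illustration, and reading `dmesg` usually requires root.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given kernel log text contains the
# IMA message that asks for the kexec_file_load syscall.
ima_blocks_kexec_load() {
    printf '%s\n' "$1" | grep -q 'try using kexec_file_load syscall'
}

# Usage (not executed here, since dmesg usually needs root):
#   if ima_blocks_kexec_load "$(dmesg)"; then
#       echo "IMA appraisal rejected the kexec_load syscall"
#   fi
```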
To build a compatible image, should I modify the build-images.sh script to my needs?
We now pass this flag but it's not clear to me what else is needed
@Ma27 I will investigate more thoroughly during the coming week and report back if I find a solution. I will try to find a way to enroll or sign the kernel so that it is allowed to execute on Azure. If I get it working, I'll let you know the steps I took.
@Mic92 , @AkechiShiro: FYI: we have been successfully trialing nixos-anywhere with Azure Gen2 'Standard B' image types as described here: https://github.com/tiiuae/ghaf-infra/blob/main/docs/nixos-anywhere.md.
Hi @henrirosten
I'm not sure what you mean by Azure Gen 2 Standard B images ? Is the securityType of the VM TrustedLaunch ? Could you give more information ?
nixos-anywhere fails to kexec due to a missing signature (SecureBoot being enabled and enforced).
Even disabling Integrity Measurement doesn't seem enough.
For more context, trying to modprobe unsigned kernel drivers also fails
'Standard' is the Azure security type that disables secure boot and IMA.
'B'-series refers to Azure VM image sizes which are deployed on hardware types and processors as described here: https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-b-series-burstable.
There must be a way forward to publish an Azure-compatible NixOS image in the official Azure Marketplace. We would need to work with the Lanzaboote folks and see if NixOS + Lanzaboote can be combined to get at least SecureBoot support; IMA will have to be disabled at first.
vTPM doesn't really matter at first, I'd guess. But making nixos-anywhere work on top of other SecureBoot-enabled distributions seems very non-trivial. The only workaround I see is to disable SecureBoot temporarily, use nixos-anywhere, then enable it again. But what will happen then? I believe nixos-anywhere doesn't ship Lanzaboote in the NixOS image...
@Mic92: Would a PR showing the steps to use nixos-anywhere on Azure Gen 2 VMs created with SecurityType TrustedLaunch (and not Standard), by disabling SecureBoot temporarily, be acceptable for now? Or would it be useless?
There is some documentation for anyone interested in testing their non-official NixOS VM image: https://learn.microsoft.com/en-us/partner-center/marketplace/azure-vm-image-test
If anyone is interested in working on this, I'm willing to make slow progress on it as far as my time allows.
@AkechiShiro you mean having a guide that describes how to install on Azure with nixos-anywhere? Sure. Could be dropped here: https://github.com/nix-community/nixos-anywhere/tree/main/docs/howtos
Here is one idea: Shouldn't it be possible to kexec into the original kernel but with ima_appraise=off and then do the actual nixos kexec afterwards?
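A rough sketch of the first hop of that two-stage idea, untested on Azure. It assumes a Debian/Ubuntu-style layout (`/boot/vmlinuz-<release>` and `/boot/initrd.img-<release>`) and an installed kexec-tools; `set_ima_off` and `stage1_commands` are hypothetical helpers. It uses `kexec -s` on the assumption that the distribution kernel is signed, and it only prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical helper: return the given kernel command line with any existing
# ima_appraise= option dropped and ima_appraise=off appended.
set_ima_off() {
    out=
    set -f                      # no glob expansion while word-splitting $1
    for word in $1; do
        case $word in
            ima_appraise=*) ;;  # drop the existing setting
            *) out="$out$word " ;;
        esac
    done
    set +f
    printf '%sima_appraise=off\n' "$out"
}

# Print (do not run) the commands for the first hop: reload the running
# distribution kernel via kexec_file_load (-s) with IMA appraisal turned off.
# $1 = kernel release, $2 = current kernel command line.
stage1_commands() {
    printf 'kexec -s -l /boot/vmlinuz-%s --initrd=/boot/initrd.img-%s --command-line="%s"\n' \
        "$1" "$1" "$(set_ima_off "$2")"
    printf 'kexec -e\n'
}

# Usage: stage1_commands "$(uname -r)" "$(cat /proc/cmdline)"
```

After the machine comes back up on the original kernel, the NixOS kexec image would be started as usual.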
@Mic92 I will try that soon. I've tried this on a Debian 11 cloud image and was still stuck with some weird issue I couldn't debug at all, so I'll need to retry.
If it was just an old kernel then https://github.com/nix-community/nixos-images/commit/eaf2d21fa940a86ef7bc2b583850f725b86dc180 might solve it.
Hi @Mic92, sorry it took me a while to give it a try.
I ran the following as root on a machine with Secure Boot disabled and ima_appraisal=off:
curl -L https://github.com/nix-community/nixos-images/releases/download/nixos-unstable/nixos-kexec-installer-noninteractive-x86_64-linux.tar.gz | tar -xzf- -C /root
/root/kexec/run
I got this output after the reboot (after the kexec, I believe). It seems like something bad happened?
username login: [ 10.089786] CPU1 failed to report alive state
[ 10.129163] BUG: kernel NULL pointer dereference, address: 0000000000000010
[ 10.129779] #PF: supervisor read access in kernel mode
[ 10.129779] #PF: error_code(0x0000) - not-present page
[ 10.129779] PGD 0 P4D 0
[ 10.129779] Oops: 0000 [#1] PREEMPT SMP PTI
[ 10.129779] CPU: 0 PID: 11 Comm: kworker/u4:0 Not tainted 6.6.10 #1-NixOS
[ 10.129779] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 07/12/2023
[ 10.129779] Workqueue: eval_map_wq tracer_init_tracefs_work_func
[ 10.129779] RIP: 0010:event_create_dir+0x29/0x5d0
[ 10.129779] Code: 90 41 57 41 56 41 55 41 54 49 89 f4 55 53 48 83 ec 18 48 8b 46 28 4c 8b 6e 10 48 c7 c6 71 0b 9a 87 48 89 7c 24 08 48 89 04 24 8b 45 10 48 8b 18 48 89 df e8 58 40 8e 00 85 c0 0f 84 ec 04 00
[ 10.129779] RSP: 0000:ffffa21d00093dd8 EFLAGS: 00010296
[ 10.129779] RAX: 0000000000000000 RBX: ffff8bc14020e1e0 RCX: ffff8bc140808080
[ 10.129779] RDX: 0000000000000000 RSI: ffffffff879a0b71 RDI: ffff8bc140442b40
[ 10.129779] RBP: ffffffff88155260 R08: ffff8bc140b6c060 R09: 0000000000038ee0
[ 10.129779] R10: ffff8bc140c3f080 R11: 006e776f64726165 R12: ffff8bc14020e1e0
[ 10.129779] R13: 0000000000000000 R14: ffff8bc1402ed405 R15: ffffffff8875c948
[ 10.129779] FS: 0000000000000000(0000) GS:ffff8bc1fbc00000(0000) knlGS:0000000000000000
[ 10.129779] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.129779] CR2: 0000000000000010 CR3: 000000003d220001 CR4: 00000000003706f0
[ 10.129779] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 10.129779] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 10.129779] Call Trace:
[ 10.129779]
[ 10.129779] ? __die+0x23/0x70
[ 10.129779] ? page_fault_oops+0x17d/0x4b0
[ 10.129779] ? exc_page_fault+0x6d/0x150
[ 10.129779] ? asm_exc_page_fault+0x26/0x30
[ 10.129779] ? event_create_dir+0x29/0x5d0
[ 10.129779] ? event_create_dir+0x123/0x5d0
[ 10.129779] __trace_early_add_event_dirs+0x33/0x70
[ 10.129779] event_trace_init+0x98/0xf0
[ 10.129779] tracer_init_tracefs_work_func+0xa/0x2e0
[ 10.129779] process_one_work+0x174/0x340
[ 10.129779] worker_thread+0x27b/0x3a0
[ 10.129779] ? __pfx_worker_thread+0x10/0x10
[ 10.129779] kthread+0xe8/0x120
[ 10.129779] ? __pfx_kthread+0x10/0x10
[ 10.129779] ret_from_fork+0x34/0x50
[ 10.129779] ? __pfx_kthread+0x10/0x10
[ 10.129779] ret_from_fork_asm+0x1b/0x30
[ 10.129779]
[ 10.129779] Modules linked in:
[ 10.129779] CR2: 0000000000000010
[ 10.129779] ---[ end trace 0000000000000000 ]---
[ 10.129779] RIP: 0010:event_create_dir+0x29/0x5d0
[ 10.129779] Code: 90 41 57 41 56 41 55 41 54 49 89 f4 55 53 48 83 ec 18 48 8b 46 28 4c 8b 6e 10 48 c7 c6 71 0b 9a 87 48 89 7c 24 08 48 89 04 24 8b 45 10 48 8b 18 48 89 df e8 58 40 8e 00 85 c0 0f 84 ec 04 00
[ 10.129779] RSP: 0000:ffffa21d00093dd8 EFLAGS: 00010296
[ 10.129779] RAX: 0000000000000000 RBX: ffff8bc14020e1e0 RCX: ffff8bc140808080
[ 10.129779] RDX: 0000000000000000 RSI: ffffffff879a0b71 RDI: ffff8bc140442b40
[ 10.129779] RBP: ffffffff88155260 R08: ffff8bc140b6c060 R09: 0000000000038ee0
[ 10.129779] R10: ffff8bc140c3f080 R11: 006e776f64726165 R12: ffff8bc14020e1e0
[ 10.129779] R13: 0000000000000000 R14: ffff8bc1402ed405 R15: ffffffff8875c948
[ 10.129779] FS: 0000000000000000(0000) GS:ffff8bc1fbc00000(0000) knlGS:0000000000000000
[ 10.129779] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.129779] CR2: 0000000000000010 CR3: 000000003d220001 CR4: 00000000003706f0
[ 10.129779] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 10.129779] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 10.129779] note: kworker/u4:0[11] exited with irqs disabled
Wait, I gave it a second try and it's now working, so ima_appraisal=off does allow the kexec to happen with SecureBoot disabled.
Specific image used was Ubuntu 23.10
EDIT: Network connectivity seems to be broken. I believe DHCP did not run anew to get an IP address.
(experimental, only tested for nixos-unstable) Static ip addresses and routes are restored after reboot. Interface that had dynamic addresses before are configured with DHCP and to accept prefixes from ipv6 router advertisement
The IP has been preserved, but the DNS server probably needs to be tweaked. I'm not sure what the default one is; I will edit if I find the answer.
EDIT 2: Running the second kexec in order to install NixOS (using grub) with the default example plus some additional configuration led to an unbootable machine. It's stuck in Hyper-V's UEFI, saying it found no suitable boot system.
I'll have to give systemd-boot a try; I may also need to tweak disko's configuration.
So it seems ima_appraisal=off is not even needed if SecureBoot is off. The first kexec happens successfully:
+ init=/nix/store/nadvk7k5qam9iq19kshbk2c045hkd5q6-nixos-system-nixos-23.11pre-git/init
+ kernelParams=console=tty0 console=ttyS0,115200 loglevel=4
+ readlink -f /root/kexec/kexec/run
+ dirname /root/kexec/kexec/run
+ SCRIPT_DIR=/root/kexec/kexec
+ TMPDIR=/root/kexec/kexec mktemp -d
+ INITRD_TMP=/root/kexec/kexec/tmp.mI4YwicutB
+ cd /root/kexec/kexec/tmp.mI4YwicutB
+ trap cleanup EXIT
+ mkdir -p ssh
+ extractPubKeys /root
+ home=/root
+ key=/root/.ssh/authorized_keys
+ test -e /root/.ssh/authorized_keys
+ grep -o \(\(ssh\|ecdsa\|sk\)-[^ ]* .*\) /root/.ssh/authorized_keys
+ key=/root/.ssh/authorized_keys2
+ test -e /root/.ssh/authorized_keys2
+ test -n root
+ sh -c echo ~root
+ sudo_home=/root
+ extractPubKeys /root
+ home=/root
+ key=/root/.ssh/authorized_keys
+ test -e /root/.ssh/authorized_keys
+ grep -o \(\(ssh\|ecdsa\|sk\)-[^ ]* .*\) /root/.ssh/authorized_keys
+ key=/root/.ssh/authorized_keys2
+ test -e /root/.ssh/authorized_keys2
+ test -e /etc/ssh/authorized_keys.d/root
+ test -n root
+ test -e /etc/ssh/authorized_keys.d/root
+ test -e /etc/ssh/ssh_host_dsa_key
+ cp -a /etc/ssh/ssh_host_dsa_key ssh
+ test -e /etc/ssh/ssh_host_dsa_key.pub
+ cp -a /etc/ssh/ssh_host_dsa_key.pub ssh
+ test -e /etc/ssh/ssh_host_ecdsa_key
+ cp -a /etc/ssh/ssh_host_ecdsa_key ssh
+ test -e /etc/ssh/ssh_host_ecdsa_key.pub
+ cp -a /etc/ssh/ssh_host_ecdsa_key.pub ssh
+ test -e /etc/ssh/ssh_host_ed25519_key
+ cp -a /etc/ssh/ssh_host_ed25519_key ssh
+ test -e /etc/ssh/ssh_host_ed25519_key.pub
+ cp -a /etc/ssh/ssh_host_ed25519_key.pub ssh
+ test -e /etc/ssh/ssh_host_rsa_key
+ cp -a /etc/ssh/ssh_host_rsa_key ssh
+ test -e /etc/ssh/ssh_host_rsa_key.pub
+ cp -a /etc/ssh/ssh_host_rsa_key.pub ssh
+ /root/kexec/kexec/ip --json addr
+ /root/kexec/kexec/ip -4 --json route
+ /root/kexec/kexec/ip -6 --json route
+ [ -f /etc/machine-id ]
+ cp /etc/machine-id machine-id
+ find .
+ gzip -9
+ cpio -o -H newc
27 blocks
+ kexecSyscallFlags=
+ + sort -c -V uname -r
+ printf %s\n 6.1 6.5.0-1010-azure
+ kexecSyscallFlags=--kexec-syscall-auto
+ /root/kexec/kexec/kexec --load /root/kexec/kexec/bzImage --kexec-syscall-auto --initrd=/root/kexec/kexec/initrd --no-checks --command-line init=/nix/store/nadvk7k5qam9iq19kshbk2c045hkd5q6-nixos-system-nixos-23.11pre-git/init console=tty0 console=ttyS0,115200 loglevel=4
machine will boot into nixos in 6s...
+ echo machine will boot into nixos in 6s...
+ test -e /dev/kmsg
+ exec
ssh: connect to host localhost port 22: Connection refused
.... Endless repeat of the last line
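The trace shows the run script comparing the running kernel release against 6.1 with `sort -c -V` before it sets `--kexec-syscall-auto`. The following is a sketch of that check as I read it from the trace, not a copy of the script: the function name `kernel_at_least` is my own, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```shell
#!/bin/sh
# Succeeds if version $2 is greater than or equal to version $1.
# "sort -c -V" exits 0 only when its input is already in version order.
kernel_at_least() {
    printf '%s\n' "$1" "$2" | sort -c -V 2>/dev/null
}

kexecSyscallFlags=""
if kernel_at_least 6.1 "$(uname -r)"; then
    # Let kexec try kexec_file_load first and fall back to kexec_load.
    kexecSyscallFlags="--kexec-syscall-auto"
fi
echo "kexecSyscallFlags=$kexecSyscallFlags"
```

On the Ubuntu 23.10 VM from the trace (6.5.0-1010-azure), this check passes and the flag is set, which matches the logged `kexec --load` invocation.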
On the VM, NixOS did kexec successfully and the ssh service is running :
[nixos@nixos:~]$ systemctl status sshd
● sshd.service - SSH Daemon
Loaded: loaded (/etc/systemd/system/sshd.service; enabled; preset: enabled)
Active: active (running) since Tue 2024-01-16 ; 14s ago
Process: 636 ExecStartPre=/nix/store/n7lpzrgsj5kmwsnm8fvv8cawr8qycym6-unit->
Main PID: 639 (sshd)
IP: 0B in, 0B out
IO: 1.3M read, 0B written
Tasks: 1 (limit: 4195)
Memory: 3.4M
CPU: 133ms
CGroup: /system.slice/sshd.service
└─639 "sshd: /nix/store/9fkxlh9gyxnb7bahc2rn0b5fhamgb63m-openssh-9>
nixos systemd[1]: Starting SSH Daemon...
nixos systemd[1]: Started SSH Daemon.
nixos sshd[639]: Server listening on 0.0.0.0 port 22.
nixos sshd[639]: Server listening on :: port 22.
By following some tips on https://github.com/nix-community/nixos-anywhere/issues/112 and also https://github.com/tiiuae/ghaf-infra/blob/main/docs/nixos-anywhere.md?plain=1#L138-L149
I was able to install NixOS using nixos-anywhere. I also had to use --post-kexec-ssh-port as the port wasn't the default one.
I will try to document the steps and create a PR in the future.
~~EDIT : I'm still lacking internet connectivity despite being able to reach the virtual machine using ssh :thinking: (dns seems to be working fine)~~ (I was wrong everything works as intended) EDIT 2 : Also did the install with systemd-boot instead of grub.
I think nixos-anywhere could automate this kexec step as well if it detects a locked down kernel.
By lockdown you mean if IMA is configured and enabled ?
However, for SecureBoot-enabled machines we still don't have a solution. I believe the only way to have SecureBoot on Azure would probably be to first contact Microsoft to find out if there is a process.
But there is probably no way to use nixos-anywhere unless we can sign the kernels with the key enrolled on the Azure machine.
Is IMA not the mechanism that is in place in case the machine was booted with secure boot?
I think IMA is kind of an extension of SecureBoot that covers more files, but on my test machine I did disable SecureBoot. I'll do some tests with SecureBoot on and ima_appraise=off and report the result.
So far, SecureBoot off with IMA appraisal off worked.
So with just SecureBoot off it should work too.
Note: sometimes the kexec also seems to fail, and the machine is more or less frozen after a null pointer dereference in the kernel, with one CPU core apparently stuck.
@Mic92 it seems that if SecureBoot is enabled, it is not possible to kexec.
$ cat /proc/cmdline
BOOT.... console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200 consoleblank=0 ima_appraise=off
See, even with ima_appraise=off:
[ 60.022694] PEFILE: Unsigned PE binary
[ 60.024444] kexec_file: Enforced kernel signature verification failed (-61).
Also, ima_appraisal does not seem to exist? I only find ima_appraise=off documented as valid online. So without SecureBoot, adding ima_appraise=off may not be needed.
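To check which of the two conditions applies on a given VM, something like the following may help. It is only a sketch: `ima_appraise_value` is a hypothetical helper, and the Secure Boot check assumes `mokutil` is installed, which is not a given on every image.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last ima_appraise= option in a
# kernel command line, or an empty line if the option is absent.
ima_appraise_value() {
    value=
    set -f                      # no glob expansion while word-splitting $1
    for word in $1; do
        case $word in
            ima_appraise=*) value=${word#ima_appraise=} ;;
        esac
    done
    set +f
    printf '%s\n' "$value"
}

if [ -r /proc/cmdline ]; then
    echo "ima_appraise=$(ima_appraise_value "$(cat /proc/cmdline)")"
fi

# Report the Secure Boot state if mokutil is available; it can fail on
# non-EFI systems, so the exit status is ignored.
if command -v mokutil >/dev/null 2>&1; then
    mokutil --sb-state || true
fi
```

Note that the helper only looks for ima_appraise=; a misspelled ima_appraisal=off on the command line yields an empty value.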
Maybe it should be stated in the README that kexec doesn't work with Secure Boot enabled.
@usama8800 feel free to add it.