Upstream Linux-based stubdomain to Xen and Qemu
This is an effort to ease maintenance of Xen and the stubdomain (reducing the number of custom patches that must be ported to each new version) and to improve stability through a larger user base. It is also a prerequisite for upstream to include the new stubdomain in automated tests.
The Xen part is already in progress; the last posted version is here: https://xen.markmail.org/thread/m2gpedfrym2wcgan
Development Xen branch: https://github.com/marmarek/xen/tree/master-linux-stubdom
Development stubdomain branch (including Qemu patches, see qemu/patches dir): https://github.com/marmarek/qubes-vmm-xen-stubdom-linux/tree/debug
While the Xen effort is already in progress, the Qemu part hasn't been started yet. Some of the Qemu patches may need adjustments to be suitable for upstream: at the very least they should not break the non-stubdomain case (currently some do).
I've started looking at upstreaming stubdom support to qemu. Hopefully I'll have something in the next week or two.
Has this already been done?
No. I made an attempt to upstream the main QEMU stubdom patches. There was some discussion, but no real conclusion. I didn't have a clear path forward and then other work took precedence.
Aside from PCI passthrough, you don't really need to patch QEMU to have it run in a stubdom. The -xen-stubdom patches aren't really necessary. Mainly they just disable adding xenstore watches since the stubdom isn't the backend for any PV devices. Upstream qemu didn't want a top-level -xen-stubdom switch.
0001-configure-add-enable-stubdom.patch 0002-xen-handle-CONFIG_STUBDOM.patch 0003-xen-hvm-handle-CONFIG_STUBDOM.patch
This hunk is no longer needed: https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux/blob/9bc88034cf80eadc2a1db4d5871c237310b4a2f5/qemu/patches/0003-xen-hvm-handle-CONFIG_STUBDOM.patch#L20-L30
HVM_PARAM_DM_DOMAIN was needed for the "default" ioreq server which has
been removed from Xen. Calling xen_get_ioreq_server_info identifies the
calling domain as an ioreq server for the target domain. This is the
identification that was formerly handled by HVM_PARAM_DM_DOMAIN.
In fact, after the removal of the default ioreq server, setting
HVM_PARAM_DM_DOMAIN returns an error... which wasn't checked.
To disable PV backends, an environment variable would be easier than playing with command line options. libxl doesn't know which command line options are supported by the QEMU inside the stubdom. Having init in the stubdom mess with command line options is a little icky. Having init set an environment variable is straightforward.
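A minimal sketch of the environment-variable idea, as it might look in the stubdom's init script. The variable name QEMU_XEN_DISABLE_PV_BACKENDS is hypothetical: no such variable exists in upstream QEMU, and QEMU would need a matching patch to honor it.

```shell
#!/bin/sh
# Hypothetical variable name, for illustration only; upstream QEMU does not
# read it, so a QEMU-side patch would be needed to act on it.
QEMU_XEN_DISABLE_PV_BACKENDS=1
export QEMU_XEN_DISABLE_PV_BACKENDS

# init would then start QEMU with the unmodified arguments from libxl, e.g.:
#   qemu $dm_args &
# Any child process inherits the variable; this stand-in child just prints it.
sh -c 'echo "child sees: ${QEMU_XEN_DISABLE_PV_BACKENDS:-unset}"'
```

The point of this shape is that libxl's argument list passes through untouched, so libxl does not need to know what the QEMU inside the stubdom supports.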
0007-Disable-NIC-option-ROM.patch
This makes missing romfiles non-fatal. Either the romfiles could be installed into the stubdom initramfs, or the relevant devices could have romfile="" added, IIRC.
For PCI passthrough:
This will hopefully go away with Marek's work.
0006-xen-round-pci-region-sizes.patch
There was another approach that had hvmloader handle the BARs. The issue is that Xen can only do mfn assignment at PAGE_SIZE granularity, so BARs smaller than a page can't be placed together. This patch makes them PAGE_SIZE minimum, but there is some concern that this is wrong. The hvmloader patch would ensure that BARs smaller than a page are not relocated into subpages (by forcing PAGE_SIZE alignment), but there is concern that the guest OS could still relocate them and shoot itself in the foot.
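The rounding described above comes down to a bit of arithmetic. This is only a sketch assuming 4 KiB pages; the function name is made up here and is not taken from the QEMU patch.

```shell
#!/bin/sh
# Assumption: 4 KiB pages, as on x86.
PAGE_SIZE=4096

# Round a BAR size (in bytes) up to the next multiple of PAGE_SIZE.
# round_up_to_page is an illustrative name, not a function from the patch.
round_up_to_page() {
    echo $(( ($1 + PAGE_SIZE - 1) / PAGE_SIZE * PAGE_SIZE ))
}

round_up_to_page 256     # a sub-page BAR grows to one full page
round_up_to_page 16384   # a BAR that is already page-sized is unchanged
```

With every region at least one page long, no two BARs can end up sharing an mfn, which is what makes the per-page assignment in Xen workable.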
0008-xen-fix-stubdom-PCI-addr.patch
I had a patch I need to find again and upstream that makes this work for stubdom and non-stubdom.
An equivalent (I think) was upstreamed in QEMU:
commit be9c61da9fc57eb7d293f380d0805ca6f46c2657
Author: Chuck Zmudzinski
Date: Wed Jun 29 13:07:12 2022 -0400
xen/pass-through: merge emulated bits correctly
I think any other patches you all are carrying are Qubes-specific.
Hello!
I would like to use Linux stubdomains on a Debian system. I built a bzImage and rootfs following the comment at https://old-list-archives.xen.org/archives/html/xen-devel/2020-05/msg01458.html.
To be able to build the Linux stubdomain I had to apply this patch:
diff --git a/Makefile.stubdom b/Makefile.stubdom
index cbe9049..ade29b1 100644
--- a/Makefile.stubdom
+++ b/Makefile.stubdom
@@ -50,7 +50,7 @@ endif
BUSYBOX_PATCHES := $(shell $Q -l busybox)
-build/busybox/.extracted: busybox-$(BUSYBOX_VERSION).tar.bz2
+build/busybox/.extracted: dl/busybox-$(BUSYBOX_VERSION).tar.bz2
rm -rf build/busybox
mkdir -p build/busybox
tar -C build/busybox --strip-components=1 -xf $<
@@ -75,7 +75,7 @@ build/busybox/busybox: build/busybox/config.status
PULSEAUDIO_PATCHES := $(shell $Q -l pulseaudio)
-build/pulseaudio/.extracted: pulseaudio-$(PULSEAUDIO_VERSION).tar.xz
+build/pulseaudio/.extracted: dl/pulseaudio-$(PULSEAUDIO_VERSION).tar.xz
rm -rf build/pulseaudio
mkdir -p build/pulseaudio
tar -C build/pulseaudio --strip-components=1 -xf $<
@@ -155,7 +155,7 @@ build/pulseaudio/config.status: build/pulseaudio/.patched build/pulseaudio/src/
build/padist/usr/local/bin/pulseaudio: build/pulseaudio/config.status
$(MAKE) $(MAKE_PARALLEL) install-strip -C build/pulseaudio DESTDIR=$(shell pwd)/build/padist
-build/libusb/.extracted: libusb-$(LIBUSB_VERSION).tar.bz2
+build/libusb/.extracted: dl/libusb-$(LIBUSB_VERSION).tar.bz2
rm -rf build/libusb
mkdir -p build/libusb
tar -C build/libusb --strip-components=1 -xf $<
@@ -181,11 +181,11 @@ build/qrexec/agent/qrexec-agent:
QEMU_PATCHES := $(shell $Q -l qemu)
-build/qemu/.extracted: qemu-$(QEMU_VERSION).tar.xz
+build/qemu/.extracted: dl/qemu-$(QEMU_VERSION).tar.xz
rm -rf build/qemu
mkdir -p build/qemu
tar -C build/qemu --strip-components=1 -xf $<
- rm build/qemu/pc-bios/*.{rom,bin,dtb} # remove prebuilt binaries
+ # rm build/qemu/pc-bios/*.{rom,bin,dtb} # remove prebuilt binaries
touch $@
build/qemu/.patched: build/qemu/.extracted qemu/patches/series $(QEMU_PATCHES)
@@ -289,7 +289,7 @@ build/qemu/build/qemu-system-i386: build/qemu/.patched build/qemu/build/config.s
LINUX_PATCHES := $(shell $Q -l linux)
-build/linux/.extracted: linux-$(LINUX_VERSION).tar
+build/linux/.extracted: dl/linux-$(LINUX_VERSION).tar
rm -rf build/linux
mkdir -p build/linux
tar -C build/linux --strip-components=1 -xf $<
diff --git a/rootfs/gen b/rootfs/gen
index 1bafdbe..ab09b2c 100755
--- a/rootfs/gen
+++ b/rootfs/gen
@@ -21,7 +21,7 @@ inst() {
"$DRACUT_INSTALL" -D "$rootfs_dir" -l "$@"
}
-mkdir -p "$rootfs_dir"/{bin,etc,proc/xen,sys,dev,tmp}
+mkdir -p "$rootfs_dir"/{bin,etc,proc/xen,sys,dev,tmp,share/qemu}
echo "Building initrd in $rootfs_dir"
inst build/busybox/busybox /bin/busybox
@@ -74,9 +74,9 @@ fi
make DESTDIR="$PWD/build/qemu/install" -C build/qemu/build install
inst build/qemu/install/usr/bin/qemu-system-i386 /bin/qemu
-inst build/qemu/build/pc-bios/optionrom/linuxboot_dma.bin /share/qemu/linuxboot_dma.bin
-cp build/qemu/pc-bios/vgabios-cirrus.bin "$rootfs_dir"/share/qemu/vgabios-cirrus.bin
-cp build/qemu/pc-bios/vgabios-stdvga.bin "$rootfs_dir"/share/qemu/vgabios-stdvga.bin
+inst build/qemu/pc-bios/linuxboot_dma.bin /share/qemu/linuxboot_dma.bin
+inst build/qemu/pc-bios/vgabios-cirrus.bin /share/qemu/vgabios-cirrus.bin
+inst build/qemu/pc-bios/vgabios-stdvga.bin /share/qemu/vgabios-stdvga.bin
inst xenstore-read /bin/xenstore-read
Then I tested it with config like this:
type = "hvm"
name = "debian"
memory = 1024
vcpus = 2
disk = [
'/tmp/images/debian-mini/debian.img,raw,xvda,rw',
'/tmp/images/debian-mini/debian.iso,raw,hdc,cdrom'
]
videoram = 128
vga = "stdvga"
vnc = 1
vncunused = 1
stubdomain_kernel="/tmp/build/linux/arch/x86/boot/bzImage"
stubdomain_ramdisk="/tmp/build/rootfs/stubdom-linux-rootfs"
device_model_stubdomain_override=1
device_model_version="qemu-xen"
device_model_override="/usr/libexec/xen-qemu-system-i386"
When I run it, I get an error:
$ sudo xl create vm.hvm
Parsing config from vm.hvm
libxl: error: libxl_dm.c:2848:stubdom_xswait_cb: Domain 2010:Stubdom 2011 for 2010 startup: startup timed out
libxl: error: libxl_create.c:1939:domcreate_devmodel_started: Domain 2010:device model did not start: -9
libxl: error: libxl_xshelp.c:206:libxl__xs_read_mandatory: xenstore read failed: `/libxl/2010/type': No such file or directory
libxl: warning: libxl_dom.c:49:libxl__domain_type: unable to get domain type for domid=2010, assuming HVM
From qemu-dm log file (qemu-dm-debian.log):
+ add-fd /tmp/qemu.qmp /dev/fd/4 '{"execute":"qmp_capabilities","id":1}
{"execute":"add-fd", "arguments": { "fdset-id": 1 }, "id":42 }
' 42
{"QMP": {"version": {"qemu": {"micro": 2, "minor": 0, "major": 9}, "package": ""}, "capabilities": ["oob"]}}
xen be core: xs_mkdir device-model/2010/backends/vkbd: failed
xen be core: xs_mkdir device-model/2010/backends/vkbd: failed
{"QMP": {"version": {"qemu": {"micro": 2, "minor": 0, "major": 9}, "package": ""}, "capabilities": ["oob"]}}
GUI domain id not set before first surface allocation!
qemu: ../ui/console.c:520: qemu_create_displaysurface_from: Assertion `data != NULL' failed.
add-fd: read from socket: Connection reset by peer
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000100
CPU: 0 PID: 1 Comm: init Not tainted 6.6.44-xen-stubdom #1
Call Trace:
<TASK>
dump_stack_lvl+0x25/0x34
panic+0xf8/0x25b
do_exit+0x172/0x6e2
? handle_mm_fault+0x77/0xf9
do_group_exit+0x61/0x61
__x64_sys_exit_group+0xf/0xf
do_syscall_64+0x64/0x76
entry_SYSCALL_64_after_hwframe+0x4b/0xb5
RIP: 0033:0x7fa80006b209
Code: 00 4c 8b 05 f9 db 0f 00 be e7 00 00 00 ba 3c 00 00 00 eb 12 0f 1f 44 00 00 89 d0 0f 05 48 3d 00 f0 ff ff 77 1c f4 89 f0 0f 05 <48> 3d 00 f0 ff ff 76 e7 f7 d8 64 41 89 00 eb df 0f 1f 80 00 00 00
RSP: 002b:00007ffead45d438 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007ffead45df04 RCX: 00007fa80006b209
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000001
RBP: 0000000000000003 R08: ffffffffffffff80 R09: 0000000000000084
R10: e28049598e6cf04e R11: 0000000000000202 R12: 0000000000000000
R13: 00007ffead45d7e0 R14: 0000000000000000 R15: 00007fa8002c8020
</TASK>
Kernel Offset: disabled
My build is based on commit 0bbe9b8080850da7e48952b86de967d9839c870a.
Could you please help me understand what is happening? Thanks in advance!
There are two issues:
- Looks like vkbd support is broken. We have it disabled with a patch, but there is also an xl config option for that: xkb_device=0.
- The implementation in https://github.com/QubesOS/qubes-vmm-xen-stubdom-linux (and @jandryuk's fork too) relies on Qubes-specific features. The crash above is related to the missing qubes-gui-daemon in dom0 and the config options related to it. I have a branch with those removed, but its functionality is also limited (for example, no VGA access and no keyboard/mouse).
The option is vkb_device=0, but I think those lines are non-fatal: they are QEMU trying to indicate the presence of its backend in xenstore.
Some of my branches were tested with upstream Xen (https://github.com/jandryuk/qubes-vmm-xen-stubdom-linux/commits/qemu-upstream/ or https://github.com/jandryuk/qubes-vmm-xen-stubdom-linux/commits/uprevs-plus/), but they are rather out of date now. Trying https://github.com/marmarek/qubes-vmm-xen-stubdom-linux/tree/for-upstream2 is probably your best first step.
@skvl you got domid up to 2010 - impressive :)
@marmarek , @jandryuk , thanks a lot! I will try.
you got domid up to 2010 - impressive :)
Yes, I ran drakvuf a few times :)
@marmarek, FYI
This change does not work on Debian:
commit 976db274f7ead7b052ec13a36230e9667e74f2ef
Author: Marek Marczykowski-Górecki
Date: Tue Feb 13 00:39:27 2024 +0100
Workaround broken ldconfig in Alpine
Alpine's ldconfig ignores all options, including -r. Enter the target
directory and use relative path, to work with both proper ldconfig and
the crippled one in Alpine. The proper one will issue a warning about
relative path, but for the stubdomain it is harmless.
diff --git a/rootfs/gen b/rootfs/gen
index 67bb981..a189a9a 100755
--- a/rootfs/gen
+++ b/rootfs/gen
@@ -100,7 +100,8 @@ mv "$rootfs_dir"/lib{.new,}
# possible leftovers from local $LD_LIBRARY_PATH + dracut-install
rm -rf "${rootfs_dir:?}"/home
touch "$rootfs_dir"/etc/ld.so.conf
-/sbin/ldconfig -r "$rootfs_dir" /lib
+# ldconfig in Alpine ignores -r
+(cd "$rootfs_dir"; /sbin/ldconfig -r "." ./lib)
ln -s lib "$rootfs_dir"/lib64
find "$rootfs_dir" -print0 | xargs -0 touch -ch -d @0
So I have tested https://github.com/marmarek/qubes-vmm-xen-stubdom-linux/tree/for-upstream2 and there is no panic! And no VGA, as was mentioned. But it looks like the VM is alive.
What should be done to enable VGA and keyboard/mouse?
Another question: would it be possible to run drakvuf in such a domain?
As far as I understand, it requires passing VMI permissions to the stubdom.
Thanks!
So I have tested https://github.com/marmarek/qubes-vmm-xen-stubdom-linux/tree/for-upstream2 and there is no panic! And no VGA, as was mentioned. But it looks like the VM is alive.
What should be done to enable VGA and keyboard/mouse?
If this is for a desktop machine (human at the console), I recommend just running the Qubes OS GUI daemon on the host. It doesn’t have any dependencies besides standard X11 and Xen libraries and Qubes OS libvchan, so it should work outside of Qubes OS. This has vastly reduced attack surface compared to using a network protocol.
If this is a headless server, I recommend using QEMU’s built-in network graphics support. You can use a proxy somewhere else if you want to ensure that secrets (such as TLS private keys) are not exposed to the QEMU process.
It's inefficient, but I was able to manually get VNC working once. The stubdom needs to be rebuilt with something like:
diff --git a/Makefile.stubdom b/Makefile.stubdom
index 6c751fe..070a0d8 100644
--- a/Makefile.stubdom
+++ b/Makefile.stubdom
@@ -82,7 +82,6 @@ build/qemu/build/config.status: build/qemu/.patched
--disable-guest-agent \
--audio-drv-list= \
--disable-smartcard \
- --disable-vnc \
--disable-spice \
--enable-trace-backends=log \
--disable-gnutls \
diff --git a/qemu/patches/series b/qemu/patches/series
index 7b4c201..584f9a2 100644
--- a/qemu/patches/series
+++ b/qemu/patches/series
@@ -4,3 +4,4 @@ round-pci-region-sizes.patch
disable-nic-option-rom.patch
0001-xen-Fix-host-pci-for-stubdom.patch
i386-load-kernel-on-xen-using-DMA.patch
+qemu-ui-con-NULL.patch
diff --git a/rootfs/gen b/rootfs/gen
index ef82c09..e3a9415 100755
--- a/rootfs/gen
+++ b/rootfs/gen
@@ -76,6 +76,7 @@ done
make DESTDIR="$PWD/build/qemu/build/install" -C build/qemu/build install
inst build/qemu/build/install/bin/qemu-system-i386 /bin/qemu
inst build/qemu/build/pc-bios/optionrom/linuxboot_dma.bin /share/qemu/linuxboot_dma.bin
+inst build/qemu/build/install/share/qemu/keymaps/en-us /share/qemu/keymaps/en-us
inst build/qemu/pc-bios/vgabios-cirrus.bin /share/qemu/vgabios-cirrus.bin
inst build/qemu/pc-bios/vgabios-stdvga.bin /share/qemu/vgabios-stdvga.bin
diff --git a/rootfs/init b/rootfs/init
index 82fbe9a..729e407 100755
--- a/rootfs/init
+++ b/rootfs/init
@@ -104,7 +104,8 @@ qemu -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol
-chardev pipe,path=/tmp/qmp/qemu,id=m -mon chardev=m,mode=control \
-chardev socket,server=on,wait=off,path=/tmp/qemu.qmp,id=m2 -mon chardev=m2,mode=control \
-chardev socket,server=on,wait=off,path=/tmp/qemu-cdrom.qmp,id=m-cdrom -mon chardev=m-cdrom,mode=control \
- $dm_args &
+ $dm_args \
+-display vnc=unix:/tmp/qemu.vnc &
set +f
unset IFS
@@ -132,6 +133,7 @@ mdev -d
# FIXME: this assume dom0 as toolstack domain
vchan-socket-proxy 0 $device_model/qmp-vchan /tmp/qemu.qmp &
+vchan-socket-proxy 0 $device_model/vnc-vchan /tmp/qemu.vnc &
while true; do
printf '==== Press enter for shell ====\n'
qemu-ui-con-NULL.patch is a patch for a runtime console error in the old QEMU version.
Then in dom0, you run something like:
sudo vchan-socket-proxy $stub_domid /local/domain/$stub_domid/device-model/$domid/vnc-vchan /tmp/vnc-4
Then you connect your VNC viewer to /tmp/vnc-4 as a unix socket, or use socat to expose it as a TCP socket.
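For the socat variant, something along these lines should work in dom0. It is a sketch, not something taken from the proof of concept: the helper name and TCP port 5901 are arbitrary choices made here, and the helper only prints the command so it can be inspected before being run.

```shell
#!/bin/sh
# Illustrative helper (name made up here): print a socat command that
# listens on a loopback TCP port and forwards to a unix socket.
vnc_bridge_cmd() {
    sock=$1   # unix socket created by vchan-socket-proxy, e.g. /tmp/vnc-4
    port=$2   # local TCP port, chosen arbitrarily, e.g. 5901
    printf 'socat TCP-LISTEN:%s,bind=127.0.0.1,reuseaddr UNIX-CONNECT:%s\n' \
        "$port" "$sock"
}

vnc_bridge_cmd /tmp/vnc-4 5901
```

Binding to 127.0.0.1 keeps the unauthenticated VNC stream off the network. Without socat's fork option the bridge serves a single connection and then exits.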
Making libxl handle the dom0 side vchan-socket-proxy would be the right way to do it. I was just making a proof of concept and don't plan to pursue it further.
I can't comment on drakvuf. The stubdom is running from a ramdisk.