Running docker in systemd-nspawn
I was trying to start the container inside a systemd-nspawn container into which /dev/kvm is bind mounted. Inside the jail I ran kvm-ok:
root@debian:/mnt/safe/docker/compose/windows$ kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used
The jail is the Docker host, with the following config:
version: "3"
services:
  windows:
    image: dockurr/windows
    container_name: windows
    devices:
      - /dev/kvm
    cap_add:
      - NET_ADMIN
    ports:
      - 8006:8006
      - 3389:3389/tcp
      - 3389:3389/udp
    stop_grace_period: 2m
    restart: on-failure
    volumes:
      - /mnt/tank/all/kvm/win:/storage
But when I start the container, the VM just shuts down:
1933312K ........ ........ ........ ........ 41% 65.1M 37s
1966080K ........ ........ ........ ........ 42% 69.9M 36s
1998848K ........ ........ ........ ........ 42% 33.9M 36s
...
4718592K ........ ........ ........ ........ 99% 69.3M 0s
4751360K ..... 100% 38.2M=62s
❯ Extracting Windows 11 bootdisk...
❯ Extracting Windows 11 environment...
❯ Extracting Windows 11 setup...
❯ Extracting Windows 11 image...
❯ Adding XML file for automatic installation...
❯ Building Windows 11 image...
❯ Creating a 64G growable disk image in raw format...
❯ Booting Windows using QEMU emulator version 8.2.1 ...
BdsDxe: failed to load Boot0002 "UEFI QEMU QEMU HARDDISK " from PciRoot(0x0)/Pci(0xA,0x0)/Scsi(0x0,0x0): Not Found
BdsDxe: loading Boot0001 "UEFI QEMU QEMU CD-ROM " from PciRoot(0x0)/Pci(0x5,0x0)/Scsi(0x0,0x0)
BdsDxe: starting Boot0001 "UEFI QEMU QEMU CD-ROM " from PciRoot(0x0)/Pci(0x5,0x0)/Scsi(0x0,0x0)
❯ Shutdown completed!
How can I get more logs, or what am I missing? Is it not possible to run this with a bind-mounted /dev/kvm or inside an nspawn jail? I also installed qemu-kvm inside the jail:
sudo apt -y install qemu-kvm libvirt-daemon bridge-utils virtinst libvirt-daemon-system
System: TrueNAS-Scale; Ryzen 1600x, SVM enabled
Jail OS: Debian Bookworm
Very strange. Normally when it shuts down it should at least print an error message or the reason for shutting down, so it is very hard to tell what is happening right now.
I have now created a new release (v2.05) which should provide more error info in your case. Can you please try it and see if it reports more? And if not, can you try setting:
environment:
  CONSOLE: "Y"
in your compose file, and see if it outputs more info?
Thanks for the effort, here are the results:
docker-compose:
version: "3"
services:
  windows:
    image: dockurr/windows
    container_name: windows
    devices:
      - /dev/kvm
    cap_add:
      - NET_ADMIN
    ports:
      - 8006:8006
      - 3389:3389/tcp
      - 3389:3389/udp
    stop_grace_period: 2m
    restart: on-failure
    environment:
      MANUAL: "Y"
      CONSOLE: "Y"
    command: sleep infinity
Logs:
❯ Starting Windows for Docker v2.05...
❯ For support visit https://github.com/dockur/windows
❯ Downloading Windows 11...
[i] Downloading Windows media from official Microsoft servers...
[i] Downloading Windows 11...
[+] Got latest ISO download link (valid for 24 hours): https://software.download.prss.microsoft.com/dbazure/Win11_23H2_English_x64v2.iso?t=1122dfc9-36be-439d-b4b6-f16ca36b00a0&P1=1710254095&P2=601&P3=2&P4=1FloUyZlyxr%2f5wEBOrtHw8aWR4zK7kco4KoNsCL4eStk6tPD69rPy2Hadw6JVrzsutwGagcJyx3HtpN%2f36aYySC3PxmgeUoZE3q4yx0vrEcKa9iM%2bZYfatQoL77g64zVOzy9XLHBEM%2fuC4PnukGLCJyRGi%2fYHYQnoJzY74rrhmxlRZM8H%2f4CFqtgkU5yAPt7gA4JZc88c6YF3Rkesio20jG95nSvftrvsRdMl3tcloayvf4h1ezzxknQVKpZfkuQNGgOa61eY5j6Sdd80zBj%2fJ3BIY4H3m3Br%2bR36DHmAPdDZzDnUIGw87xgXw6o6z%2f1QMjCNZsxb%2fBpKEJrBmJeuw
# 1.4%
# 1.6%
# 1.7%
...
######################################################################## 100.0%
[+] Successfully downloaded Windows image!
❯ Extracting Windows 11 image...
❯ Building Windows 11 image...
❯ Creating a 64G growable disk image in raw format...
❯ Booting Windows using QEMU emulator version 8.2.1 ...
char device redirected to /dev/pts/0 (label serial0)
Weird, now it doesn't even say that it is shutting down. With the previous version I could briefly see the same logs in the browser on port 8006. Now the container exits directly after "connecting to VNC" without any error messages.
Does the /dev/kvm device need some special permissions? On the TrueNAS host, in the nspawn container, and in the Docker container the device has the following permissions:
root@65350c78fe0e:/# ls -la /dev/kvm
crw-rw---- 1 root 104 10, 232 Mar 11 14:43 /dev/kvm
root@65350c78fe0e:/#
Thank you!
So I did some digging and found out that the container also needs the device /dev/vhost-net.
I found this in the troubleshooting section of a different docker-kvm project:
https://github.com/BBVA/kvm?tab=readme-ov-file#notes--troubleshooting
So what I did was bind mount /dev/vhost-net into the nspawn container, and then QEMU was able to boot up.
The weird thing is what happens when I enable the console for more debug output, as @kroese suggested:
environment:
  CONSOLE: "Y"
The VM would not start, and only the following logs appeared with CONSOLE: "Y" set:
❯ Starting Windows for Docker v2.05...
❯ For support visit https://github.com/dockur/windows
❯ Booting Windows using QEMU emulator version 8.2.1 ...
char device redirected to /dev/pts/0 (label serial0)
... and when I remove it, it works, but the logs are more or less empty:
No log line matching the '' filter
I guess there is some kind of bug with that environment variable. I don't know whether this only affects my special setup or whether it should be a separate issue.
@kiesstein Mmmh, very interesting find! In the past I had /dev/vhost-net in the example compose file, but it seemed unnecessary, as the container can create this device automatically via mknod commands because of the NET_ADMIN capability. So I removed /dev/vhost-net from the compose file to keep it short.
Also, it's weird that it does not complain about anything when the device is created, but just exits much later when QEMU is launched.
And /dev/vhost-net is completely optional unless you are using DHCP mode with macvlan. The default (bridge network) mode can also run without it, just with slightly less performance.
So I still don't completely understand what is going on in your case. But maybe I should just not create it automatically, except in macvlan mode, so that if it causes any problems, they do not occur in bridge mode...
Food for thought!
You are right @kroese: it could not use mknod to create the device, because the nspawn container needs the rights to do so.
I removed the bind mount of the device and instead granted the container rwm rights on /dev/vhost-net, i.e. a systemd-nspawn container with '--property=DeviceAllow=/dev/vhost-net rwm', and the Windows VM boots up without issues!
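For anyone comparing the two ways of exposing a host device to the jail (a bind mount versus a DeviceAllow rule), here is a rough sketch of the corresponding systemd-nspawn arguments. nspawn_device_args is a hypothetical helper made up for illustration, and whether either variant is sufficient will depend on the setup.

```shell
# Hypothetical helper: print one systemd-nspawn argument per line.
# mode is "bind" (bind mount the device node), "allow" (device rule only),
# or "both".
nspawn_device_args() {
  mode=$1
  shift
  for dev in "$@"; do
    case $mode in
      bind|both) printf '%s\n' "--bind=$dev" ;;
    esac
    case $mode in
      # rwm = read, write and mknod access to the device
      allow|both) printf '%s\n' "--property=DeviceAllow=$dev rwm" ;;
    esac
  done
}

# The variant described above: no bind mount, only an rwm rule.
nspawn_device_args allow /dev/vhost-net
```

One argument is printed per line because the DeviceAllow value contains a space and has to reach systemd-nspawn as a single argument.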
Yes, but the problem in this case is that mknod didn't return any error on your system. If it had failed, the script would have launched QEMU without vhost-net and everything would have been fine, because it's optional.
But because mknod returned successfully, the script assumed the device was available and tried to use it.
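One possible way to guard against this situation is to test whether the device can actually be opened, instead of trusting the exit code of mknod. The following is only a sketch and not the container's actual script; device_usable is a hypothetical helper name.

```shell
# Return 0 only if the path is a character device that can actually be
# opened for reading and writing. A device node can exist (mknod succeeded)
# and still be refused when it is opened.
device_usable() {
  [ -c "$1" ] || return 1
  # Open in a subshell so the file descriptor is closed again immediately.
  ( exec 3<>"$1" ) 2>/dev/null
}

# Hypothetical use: only enable vhost-net when the open succeeds.
if device_usable /dev/vhost-net; then
  echo "vhost-net: usable"
else
  echo "vhost-net: not usable, continuing without it"
fi
```

Since vhost-net is optional, falling back to running without it seems like the safer default when the check fails.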
In any case, I will make some changes and just disable vhost-net unless somebody explicitly adds the device to their compose file.
I created a new tag (v2.06). Could you do me a favor and test if this version works in your original situation (where you did not mount /dev/vhost-net yet)? To see if the original issue is now solved.
Ok, so I tried to reproduce the problem with 2.05, tested with winxp, and could not reproduce it.
I tried to install win11 again with --property='DeviceAllow=/dev/vhost-net rwm' but it still does not work. Maybe it is only an installation issue, or previously I only tried with winxp because it is faster to test. Sadly I don't remember anymore.
So it is at least a win11 issue.
The statement that it works with --property='DeviceAllow=/dev/vhost-net rwm' and win11 is false!
Next I tried bind mounting with --bind=/dev/vhost-net instead of --property='DeviceAllow=/dev/vhost-net rwm', but sadly got the same issue.
Next was --capability=all without --property='DeviceAllow=/dev/vhost-net rwm' or --bind=/dev/vhost-net -> not working.
Next was --capability=all, --property='DeviceAllow=/dev/vhost-net rwm' and --bind=/dev/vhost-net -> not working.
I also ran modprobe vhost_vsock on the jail host because I remember doing that before, but it did not help.
Then I tested all bind mounts listed in this, plus vhost-net:
--bind=/dev/kvm --capability=all --bind=/dev/vhost-net --bind=/dev/fuse --bind=/dev/vsock --bind=/dev/vhost-vsock
-> not working.
I tried winxp with the same settings to make sure this at least still works, and it did.
Then I also tried to pass vhost-net through in the compose file, again with win11, like:
devices:
  - /dev/kvm
  - /dev/vhost-net
And it still does not work. So I am no longer sure what exactly I did in my testing, but I can't get it running anymore with 2.05 and win11, and I do not remember whether it ever ran with win11, so there is that.
Then I tried win10 with all the previously mentioned settings and it did not work.
With win8 it did work! So the issue starts somewhere with win10.
I am sorry that I tested incorrectly (I changed more than one setting at a time, including the switch to winxp).
Then I did a docker compose pull and restarted the container with win8. The first boot after recreation did not work, but then I restarted the container again and it did boot. I don't know, maybe a fluke?
❯ Starting Windows for Docker v2.07...
❯ For support visit https://github.com/dockur/windows
❯ Booting Windows using QEMU emulator version 8.2.1 ...
BdsDxe: loading Boot0003 "Windows Boot Manager" from HD(1,GPT,3C9F9213-A787-474C-8772-B5A6D80C1E6B,0x800,0x40000)/\EFI\Microsoft\Boot\bootmgfw.efi
BdsDxe: starting Boot0003 "Windows Boot Manager" from HD(1,GPT,3C9F9213-A787-474C-8772-B5A6D80C1E6B,0x800,0x40000)/\EFI\Microsoft\Boot\bootmgfw.efi
❯ Shutdown completed!
❯ Starting Windows for Docker v2.07...
❯ For support visit https://github.com/dockur/windows
❯ Booting Windows using QEMU emulator version 8.2.1 ...
BdsDxe: loading Boot0003 "Windows Boot Manager" from HD(1,GPT,3C9F9213-A787-474C-8772-B5A6D80C1E6B,0x800,0x40000)/\EFI\Microsoft\Boot\bootmgfw.efi
BdsDxe: starting Boot0003 "Windows Boot Manager" from HD(1,GPT,3C9F9213-A787-474C-8772-B5A6D80C1E6B,0x800,0x40000)/\EFI\Microsoft\Boot\bootmgfw.efi
Then I tried win10 again without any bind mounts and so on, but it is not working with v2.07. It also does not work with all the bind mounts plus privileged: true and - /dev/vhost-net in the docker compose file.
In summary, none of the settings made a difference. The only thing I found out is that it only works with winxp and win8 (I did not test win7, vista, ...) and that it does not work with win10 and win11.
So I am fairly certain that the devices or bind mounts are not the problem, so sadly we are back to square one.
Can you try v3.05 while adding the privileged: true setting to your compose file? This will set the ignore_msrs KVM parameter automatically, which might solve your issue.
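To see whether the parameter actually changed, its current value can be read from sysfs on the host. A minimal sketch, assuming the kvm module is loaded and exposes its parameters under /sys/module/kvm/parameters; kvm_param is a hypothetical helper and not part of the container.

```shell
# Print the value of a KVM module parameter, or "unknown" if it cannot be read.
# The optional second argument overrides the parameter directory, which makes
# the helper easy to try against a scratch directory.
kvm_param() {
  dir=${2:-/sys/module/kvm/parameters}
  if [ -r "$dir/$1" ]; then
    cat "$dir/$1"
  else
    echo "unknown"
  fi
}

echo "ignore_msrs: $(kvm_param ignore_msrs)"
```

Comparing the value before and after starting the container with privileged: true should show whether the setting was applied.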