multipass icon indicating copy to clipboard operation
multipass copied to clipboard

Cannot spin up any VMs with multipass after a power outage.

Open itellaetxe opened this issue 3 months ago • 8 comments

Describe the bug I cannot spin up any VMs with multipass after a power outage.

To Reproduce How, and what happened?

  1. Launched a VM in a server.
  2. It worked for 3 months nonstop, without any problems.
  3. A power outage made the host restart.
  4. When I tried to start the VM back, it gets stuck in "Starting" and gets timed out every time, and goes into "Unknown" state. I can put it in the "Stopped" state, but I cannot start it properly.

Find the logs below.

Logs

Sep 22 14:02:06 nekane multipassd[25616]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 22 14:02:06 nekane multipassd[25616]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 22 14:06:18 nekane multipassd[25616]: Returning setting local.driver=qemu
Sep 22 14:08:11 nekane multipassd[896]: fetch manifest periodically
Sep 22 14:11:25 nekane multipassd[25616]: fetch manifest periodically
Sep 22 14:14:11 nekane multipassd[25616]: Loading "anbox-cloud-appliance" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "charm-dev" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "docker" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "jellyfin" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "minikube" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "ros2-humble" v1
Sep 22 14:14:11 nekane multipassd[25616]: Loading "ros2-jazzy" v1
Sep 22 14:15:08 nekane multipassd[25616]: SSH session disconnected
Sep 22 14:15:08 nekane multipassd[25616]: Could not broadcast shutdown message in VM: SSH unavailable on instance test-vm: not running
Sep 22 14:15:08 nekane multipassd[25616]: QMP: {"timestamp": {"seconds": 1758543308, "microseconds": 962274}, "event": "POWERDOWN"}
Sep 22 14:15:08 nekane multipassd[25616]: VM powering down
Sep 22 14:15:08 nekane multipassd[25616]: QMP: {"return": {}}
Sep 22 14:15:13 nekane multipassd[25616]: QMP: {"timestamp": {"seconds": 1758543313, "microseconds": 129234}, "event": "SHUTDOWN", "data": {"guest": true, "reason": "guest-shutdown"}}
Sep 22 14:15:13 nekane multipassd[25616]: VM shut down
Sep 22 14:15:13 nekane multipassd[25616]: process state changed to NotRunning
Sep 22 14:15:13 nekane multipassd[25616]: process finished with exit code 0
Sep 22 14:15:13 nekane multipassd[25616]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 22 14:16:38 nekane multipassd[25616]: process working dir '/snap/multipass/15737/qemu'
Sep 22 14:16:38 nekane multipassd[25616]: process program 'qemu-system-x86_64'
Sep 22 14:16:38 nekane multipassd[25616]: process arguments '-bios, OVMF.fd, --enable-kvm, -cpu, host, -nic, tap,ifname=tap-94c68a5fa3a,script=no,downscript=no,model=virtio-net-pci,mac=52:54:00:41:47:51, -device, virtio-scsi-pci,id=scsi0, -drive, file=/var/snap/multipass/common/data/multipassd/vault/instances/test-vm/ubuntu-22.04-server-cloudimg-amd64.img,if=none,format=qcow2,discard=unmap,id=hda, -device, scsi-hd,drive=hda,bus=scsi0.0, -smp, 1, -m, 1024M, -qmp, stdio, -chardev, null,id=char0, -serial, chardev:char0, -nographic, -cdrom, /var/snap/multipass/common/data/multipassd/vault/instances/test-vm/cloud-init-config.iso'
Sep 22 14:16:38 nekane multipassd[25616]: [27327] started: qemu-system-x86_64 -nographic -dump-vmstate /tmp/multipassd.bxXhmd
Sep 22 14:16:38 nekane multipassd[25616]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 22 14:16:38 nekane multipassd[25616]: Applied AppArmor policy: multipass.test-vm.qemu-system-x86_64
Sep 22 14:16:38 nekane multipassd[25616]: process state changed to Starting
Sep 22 14:16:38 nekane multipassd[25616]: process state changed to Running
Sep 22 14:16:38 nekane multipassd[25616]: [27329] started: qemu-system-x86_64 -bios OVMF.fd --enable-kvm -cpu host -nic tap,ifname=tap-94c68a5fa3a,script=no,downscript=no,model=virtio-net-pci,mac=52:54:00:41:47:51 -device virtio-scsi-pci,id=scsi0 -drive file=/var/snap/multipass/common/data/multipassd/vault/instances/test-vm/ubuntu-22.04-server-cloudimg-amd64.img,if=none,format=qcow2,discard=unmap,id=hda -device scsi-hd,drive=hda,bus=scsi0.0 -smp 1 -m 1024M -qmp stdio -chardev null,id=char0 -serial chardev:char0 -nographic -cdrom /var/snap/multipass/common/data/multipassd/vault/instances/test-vm/cloud-init-config.iso
Sep 22 14:16:38 nekane multipassd[25616]: process started
Sep 22 14:16:38 nekane multipassd[25616]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 22 14:16:38 nekane multipassd[25616]: QMP: {"QMP": {"version": {"qemu": {"micro": 1, "minor": 2, "major": 8}, "package": ""}, "capabilities": ["oob"]}}
Sep 22 14:16:38 nekane multipassd[25616]: Waiting for SSH to be up
Sep 22 14:16:38 nekane multipassd[25616]: QMP: {"return": {}}
Sep 22 14:16:40 nekane multipassd[25616]: QMP: {"timestamp": {"seconds": 1758543400, "microseconds": 744786}, "event": "RTC_CHANGE", "data": {"offset": 0, "qom-path": "/machine/unattached/device[4]/rtc"}}
Sep 22 14:16:41 nekane multipassd[25616]: QMP: {"timestamp": {"seconds": 1758543400, "microseconds": 800641}, "event": "RTC_CHANGE", "data": {"offset": 0, "qom-path": "/machine/unattached/device[4]/rtc"}}
Sep 22 14:16:51 nekane multipassd[25616]: QMP: {"timestamp": {"seconds": 1758543411, "microseconds": 687636}, "event": "NIC_RX_FILTER_CHANGED", "data": {"path": "/machine/unattached/device[22]/virtio-backend"}}

It always gets stuck here.

Additional info

  • OS: Linux 6.1.0-39-amd64 SMP PREEMPT_DYNAMIC DebianGNU/Linux 12 (bookworm) 6.1.148-1 (2025-08-26) x86_64
  • multipass version: The problem started with 1.16.1, I switched to the edge channel to get 1.17.0-dev.456+gcb628d61, but it persists.
  • multipass info: I deleted all VMs.
  • multipass get local.driver: qemu

itellaetxe avatar Sep 22 '25 12:09 itellaetxe

Hi @itellaetxe, are you able to still launch new VMs? Have you already deleted your old VMs?

sharder996 avatar Sep 22 '25 13:09 sharder996

Hi @itellaetxe, have a look at our troubleshooting docs for this sort of situation: https://documentation.ubuntu.com/multipass/latest/how-to-guides/troubleshoot/troubleshoot-launch-start-issues/

Please let us know what you find.

ricab avatar Sep 22 '25 14:09 ricab

Thank you @sharder996 and @ricab for the suggestions. In line:

are you able to still launch new VMs? Have you already deleted your old VMs?

No, I am not able to launch any new VMs, and yes, I already deleted my old VMs.

Hi @itellaetxe, have a look at our troubleshooting docs for this sort of situation

Also, as @ricab suggested, I had a look at the troubleshooting docs but I could not fix my issue. Inside journalctl I see the "NIX_RX_FILTER_CHANGED" event, so I tried monitoring the DHCP requests. The requests are there, but there are no replies, so I jumped to the No IP assigned section. The thing is, I am using a Debian machine and the guide only covers macOS, but the section that applies to me suggests launching the DHCP server (because well, I do not have a macOS firewall). Naturally, instead of giving up, I tried doing it in Debian with the kea DHCP server. After seeing that the kea-dhcp4-server.service service is running:

Sep 29 08:19:52 nekane systemd[1]: Started kea-dhcp4-server.service - Kea IPv4 DHCP daemon.
Sep 29 08:19:52 nekane kea-dhcp4[114413]: 2025-09-29 08:19:52.120 INFO  [kea-dhcp4.dhcp4/114413.140117685776896] DHCP4_STARTING Kea DHCPv4 server version 2.2.0 (stable) starting
Sep 29 08:19:52 nekane kea-dhcp4[114413]: 2025-09-29 08:19:52.121 WARN  [kea-dhcp4.dhcp4/114413.140117685776896] DHCP4_CONFIG_SYNTAX_WARNING configuration syntax warning: /etc/kea/kea-dhcp4.conf:436.39: Extrane>
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  HOSTS_BACKENDS_REGISTERED the following host backend types are available: mysql postgresql
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCPSRV_CFGMGR_SOCKET_TYPE_DEFAULT "dhcp-socket-type" not specified , using default socket type raw
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCPSRV_CFGMGR_NEW_SUBNET4 a new subnet has been added to configuration: 192.0.2.0/24 with params: t1=900, t2=1800, valid-lifetime=3600
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  COMMAND_ACCEPTOR_START Starting to accept connections via unix domain socket bound to /run/kea/kea4-ctrl-socket
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCP4_CONFIG_COMPLETE DHCPv4 server has completed configuration: added IPv4 subnets: 1; DDNS: disabled
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCPSRV_MEMFILE_DB opening memory file lease database: lfc-interval=3600 type=memfile universe=4
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCPSRV_MEMFILE_LEASE_FILE_LOAD loading leases from file /var/lib/kea/kea-leases4.csv
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCPSRV_MEMFILE_LFC_SETUP setting up the Lease File Cleanup interval to 3600 sec
Sep 29 08:19:52 nekane kea-dhcp4[114413]: WARN  DHCPSRV_NO_SOCKETS_OPEN no interface configured to listen to DHCP traffic
Sep 29 08:19:52 nekane kea-dhcp4[114413]: WARN  DHCP4_MULTI_THREADING_INFO enabled: no, number of threads: 0, queue size: 0
Sep 29 08:19:52 nekane kea-dhcp4[114413]: INFO  DHCP4_STARTED Kea DHCPv4 server version 2.2.0 started

I monitor the DHCP traffic, but there are no replies:

~$ sudo tcpdump -i mpqemubr0 udp port 67 and port 68
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on mpqemubr0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
08:29:05.889845 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:41:47:51 (oui Unknown), length 279
08:29:09.609340 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:41:47:51 (oui Unknown), length 279
08:29:14.416862 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from 52:54:00:41:47:51 (oui Unknown), length 279

When it was set up, the Debian host was configured to have a static IP in the local network, could this be interfering with assigning an IP address to the VM? I am the opposite of an expert in this field, so I really do not know if this is even relevant to the problem.

itellaetxe avatar Sep 29 '25 06:09 itellaetxe

Hi @itellaetxe, thanks for getting back with this info. On Linux, with QEMU, Multipass manages it's own DHCP server, with a program called dnsmasq, so you don't need to launch your own. The Multipass daemon — multipassd — runs that separate dnsmasq process, associated with the mpqemubr0 bridge, and with a random subnet to assign IPs in. It also creates a number of iptables rules to connect everything and NAT VM traffic.

In your case, something is either broken or being blocked in that scheme. It could well be that your custom configuration for the static IP could be interfering. How did you set it up? Are you just assigning a static IP to the correct MAC address in your router, or do you have configurations for it on your machine?

Other possibilities: do you have any custom firewall rules? Other programs that could have installed firewall rules or other firewall programs active (e.g. ufw)? Disconnected snap interfaces? Do you see a dnsmasq running?

ricab avatar Sep 29 '25 11:09 ricab

Hi @ricab, thank you for your swift reply.

The static IP was set as shown in the /etc/network/interfaces file:

# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo
iface lo inet loopback

# eno1
auto eno1
iface eno1 inet static
  address 192.168.1.93
  netmask 255.255.254.0
  gateway 192.168.1.1
  dns-nameservers 8.8.4.4 8.8.8.8

Pretty simple I think, I do not know how this could interfere. It had to be set up like this, instead of in the router side (research center policies...)

Now, to check if dnsmasq is running correctly, I restarted multipass by running sudo snap restart multipass, and I checked the logs with sudo journalctl --unit="snap.multipass*, as the documentation suggests. I see a permissions problem with dnsmasq.hosts, which cannot be read. See the attached log (line starting with "**"):

Sep 29 15:43:31 nekane multipassd[120763]: Starting Multipass 1.17.0-dev.458+g33e002d6
Sep 29 15:43:31 nekane multipassd[120763]: Daemon arguments: /snap/multipass/15782/bin/multipassd --verbosity debug --logger platform
...skipping...
Sep 29 15:43:30 nekane multipassd[120763]: [120865] started: iptables-nft --wait -t nat --insert POSTROUTING --source 10.2.83.0/24 ! --destination 10.2.83.0/24 --protocol udp --jump MASQUERADE --to-ports 1024-6>
Sep 29 15:43:30 nekane multipassd[120763]: [120866] started: iptables-nft --wait -t nat --insert POSTROUTING --source 10.2.83.0/24 ! --destination 10.2.83.0/24 --jump MASQUERADE --match comment --comment genera>
Sep 29 15:43:30 nekane multipassd[120763]: [120867] started: iptables-nft --wait -t filter --insert FORWARD --destination 10.2.83.0/24 --out-interface mpqemubr0 --match conntrack --ctstate RELATED,ESTABLISHED ->
Sep 29 15:43:30 nekane multipassd[120763]: [120868] started: iptables-nft --wait -t filter --insert FORWARD --source 10.2.83.0/24 --in-interface mpqemubr0 --jump ACCEPT --match comment --comment generated for M>
Sep 29 15:43:30 nekane multipassd[120763]: [120869] started: iptables-nft --wait -t filter --insert FORWARD --in-interface mpqemubr0 --out-interface mpqemubr0 --jump ACCEPT --match comment --comment generated f>
Sep 29 15:43:30 nekane multipassd[120763]: [120870] started: iptables-nft --wait -t filter --append FORWARD --in-interface mpqemubr0 --jump REJECT --reject-with icmp-port-unreachable --match comment --comment g>
Sep 29 15:43:30 nekane multipassd[120763]: [120871] started: iptables-nft --wait -t filter --append FORWARD --out-interface mpqemubr0 --jump REJECT --reject-with icmp-port-unreachable --match comment --comment >
Sep 29 15:43:31 nekane dnsmasq[120849]: started, version 2.90 cachesize 150
Sep 29 15:43:31 nekane dnsmasq[120849]: compile time options: IPv6 GNU-getopt DBus no-UBus i18n IDN2 DHCP DHCPv6 no-Lua TFTP conntrack ipset no-nftset auth cryptohash DNSSEC loop-detect inotify dumpfile
Sep 29 15:43:31 nekane dnsmasq-dhcp[120849]: DHCP, IP range 10.2.83.2 -- 10.2.83.254, lease time infinite
Sep 29 15:43:31 nekane dnsmasq-dhcp[120849]: DHCP, sockets bound exclusively to interface mpqemubr0
Sep 29 15:43:31 nekane dnsmasq[120849]: using only locally-known addresses for multipass
Sep 29 15:43:31 nekane dnsmasq[120849]: reading /etc/resolv.conf
Sep 29 15:43:31 nekane dnsmasq[120849]: using nameserver (hidden for privacy).#53
Sep 29 15:43:31 nekane dnsmasq[120849]: using nameserver (hidden for privacy)#53
Sep 29 15:43:31 nekane dnsmasq[120849]: using only locally-known addresses for multipass
Sep 29 15:43:31 nekane dnsmasq[120849]: read /etc/hosts - 7 names
**Sep 29 15:43:31 nekane dnsmasq[120849]: cannot read /var/snap/multipass/common/data/multipassd/network/dnsmasq.hosts: Permission denied**
Sep 29 15:43:31 nekane multipassd[120763]: Loading "anbox-cloud-appliance" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "charm-dev" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "docker" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "jellyfin" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "minikube" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "ros2-humble" v1
Sep 29 15:43:31 nekane multipassd[120763]: Loading "ros2-jazzy" v1
Sep 29 15:43:31 nekane multipassd[120763]: Certificates for the gRPC server (root: /var/snap/multipass/common/data/multipassd/multipass_root_cert.pem, subordinate: /var/snap/multipass/common/data/multipassd/cer>
Sep 29 15:43:31 nekane multipassd[120763]: Re-using existing certificates for the gRPC server
Sep 29 15:43:31 nekane multipassd[120763]: gRPC listening on unix:/var/snap/multipass/common/multipass_socket
Sep 29 15:43:31 nekane multipassd[120763]: fetch manifest periodically
Sep 29 15:43:31 nekane multipassd[120763]: Applied AppArmor policy: multipass.qemu-img
Sep 29 15:43:31 nekane multipassd[120763]: [120914] started: qemu-img snapshot -l /var/snap/multipass/common/data/multipassd/vault/instances/test-vm/ubuntu-22.04-server-cloudimg-amd64.img
Sep 29 15:43:31 nekane multipassd[120763]: Saved file `/var/snap/multipass/common/data/multipassd/multipassd-vm-instances.json` successfully in attempt #1
Sep 29 15:43:31 nekane multipassd[120763]: Saved file `/var/snap/multipass/common/cache/multipassd/vault/multipassd-image-records.json` successfully in attempt #1
Sep 29 15:43:31 nekane multipassd[120763]: Starting Multipass 1.17.0-dev.458+g33e002d6
Sep 29 15:43:31 nekane multipassd[120763]: Daemon arguments: /snap/multipass/15782/bin/multipassd --verbosity debug --logger platform

Could this be the problem?

As for firewall rules, it is unlikely that anything set up any rules, but I will check it.

itellaetxe avatar Sep 29 '25 13:09 itellaetxe

Hmm, no I don't think that dnsmasq.hosts file is the problem. That's a benign log that always shows up. We have https://github.com/canonical/multipass/issues/4108 to track it, but it shouldn't interfere (unless you added anything to it?)

I am not sure how that network file gets rendered, but would it override multipass's setup? Also, I see you have custom DNS (redacted in your logs, but shows in the config as generic ones), so did you perhaps forget something else that could be affecting connectivity?

It is most likely a firewall blocking DHCP or bridge connectivity issues. A few commands you can use to investigate further:

iptables -L -n -v | grep -i dhcp
brctl show
ip -br link

I suspect the issue is somewhere in your custom network configuration, but please let us know if you find evidence that Multipass is at fault.

ricab avatar Oct 01 '25 12:10 ricab

Unsure whether it is a related case, but it seems like ufw blocked access to my multipass VMs, as journal shows ufw block logs, as shown below:

Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=tap-4e59abb469f OUT= LEN=73 TC=0 HOPLIMIT=255 FLOWLBL=526246 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= EN=73 TC=0 HOPLIMIT=255 FLOWLBL=53115 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= SRC=10.111.103.1 DST=224.0.0.252 LEN=53 TOS=0x00 PREC=0x00 TTL=255 ID=7388 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=tap-4e59abb469f OUT= MAC= LEN=73 TC=0 HOPLIMIT=255 FLOWLBL=526246 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= SRC=10.111.103.1 DST=224.0.0.252 LEN=53 TOS=0x00 PREC=0x00 TTL=255 ID=7405 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= LEN=73 TC=0 HOPLIMIT=255 FLOWLBL=53115 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=tap-4e59abb469f OUT= MAC= LEN=73 TC=0 HOPLIMIT=255 FLOWLBL=526246 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= LEN=73 TC=0 HOPLIMIT=255 FLOWLBL=53115 PROTO=UDP SPT=5355 DPT=5355 LEN=33 
Nov 17 14:37:03 fuecoco kernel: [UFW BLOCK] IN=mpqemubr0 OUT= MAC= SRC=10.111.103.1 DST=224.0.0.252 LEN=53 TOS=0x00 PREC=0x00 TTL=255 ID=7442 PROTO=UDP SPT=5355 DPT=5355 LEN=33 

I added sudo ufw allow in on mpqemubr0 and this seems to be working for me.

medicalwei avatar Nov 17 '25 06:11 medicalwei

Yeah, thanks @medicalwei. We should have this in our docs.

ricab avatar Nov 17 '25 12:11 ricab