Multipass can't delete LXD instances that go into Error state

Open ricab opened this issue 2 years ago • 8 comments

Describe the bug

When using the LXD backend, after a failed launch where the instance gets into an error state (112 from LXD, translated to unknown in Multipass), Multipass is unable to delete the instance:

$ multipass delete --purge --all
[2022-03-24T17:29:10.900] [error] [lxd request] Operation completed with error: (400) The instance cannot be cleanly shutdown as in Error status
delete failed: Operation completed with error: (400) The instance cannot be cleanly shutdown as in Error status

We would need to give LXD the equivalent of the --force flag in such cases.
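As a rough illustration of what "the equivalent of the --force flag" could look like at the LXD REST API level: a state-change request with `"force": true`, followed by a delete. The sketch below uses `lxc query` as the HTTP client. The function names and the `PROJECT` variable are made up for this example, and whether LXD accepts a forced stop for an instance in Error state is an assumption here, not something confirmed in this issue.

```shell
#!/bin/sh
# Sketch only: force-stop, then delete, an instance via the LXD REST API.
# PROJECT and all function names are hypothetical, chosen for this example.

PROJECT="${PROJECT:-multipass}"

# Print the JSON body of a forced stop request.
force_stop_payload() {
    printf '{"action":"stop","force":true}'
}

# Ask LXD to force-stop the named instance and wait for the operation.
lxd_force_stop() {
    lxc query --wait --request PUT \
        --data "$(force_stop_payload)" \
        "/1.0/instances/$1/state?project=$PROJECT"
}

# Ask LXD to delete the named instance and wait for the operation.
lxd_delete() {
    lxc query --wait --request DELETE "/1.0/instances/$1?project=$PROJECT"
}
```

Calling `lxd_force_stop <instance>` and then `lxd_delete <instance>` would be the two-step version of a forced delete. Nothing runs until the functions are called.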

To Reproduce

Because of https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1935880, a pretty reliable way to get an LXD VM into an error state is just:

  1. multipass launch --cpus 2
    • Until the remainder of #2250 is fixed, this will loop until timeout, leaving the instance in unknown state
  2. multipass delete --purge <instance> then fails as above
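To confirm which instances actually ended up in the error state after the steps above, one option is to ask LXD directly and filter on the state column. This is only a sketch: the helper name `error_instances` is made up, and it assumes `lxc list` reports the state as the uppercase string `ERROR`.

```shell
#!/bin/sh
# Sketch only: read "name,state" CSV rows on stdin and print the names
# of instances whose state is ERROR. The function name is hypothetical.
error_instances() {
    awk -F, '$2 == "ERROR" { print $1 }'
}

# Usage (needs LXD with the "multipass" project):
#   lxc list --project=multipass --format=csv --columns=ns | error_instances
```

Keeping the filter separate from the `lxc list` call makes it easy to try on sample rows without LXD installed.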

Expected behavior

Multipass would be able to delete the instance.

ricab • Mar 24 '22 17:03

I had the same issue, and restarting LXD via snap seemed to fix it. Update: although that initially seemed to work, I realised I was using Multipass 1.9.1-rc.1+gcdd5686e, so I removed it via snap and installed the stable release, 1.9.0. I then tried launching a 2G instance, but it hung on creation, its status went to unknown, and the issue you just described happened again. I am running the LXD driver because I wanted every instance to be automatically assigned an IP from my LAN.

oscarmparedes • May 07 '22 07:05

The same problem happened to me today. How can I remove the VM now?

aatrcoutinho • Jun 02 '22 03:06

Hi @aatrcoutinho, until we get this fixed, you can delete instances in LXD directly: lxc delete <instance-name> --project=multipass

ricab • Jul 26 '22 10:07

Hi @aatrcoutinho, until we get this fixed, you can delete instances in LXD directly: lxc delete <instance-name> --project=multipass

Lifesaver !! This worked for me too:

lxc delete deb12 --project=multipass --force
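For anyone scripting this, here is a minimal sketch of the same workaround as a fallback: try Multipass first, and only go to LXD with `--force` if that fails. The function name `delete_instance` is made up for this example, and the sketch does not check what Multipass itself reports about the instance afterwards.

```shell
#!/bin/sh
# Sketch only: delete via Multipass, falling back to a forced LXD delete.
# The function name is hypothetical.
delete_instance() {
    name="$1"
    # Normal path: let Multipass delete and purge the instance.
    if multipass delete --purge "$name"; then
        return 0
    fi
    # Fallback: the workaround from this thread, with --force added.
    echo "multipass delete failed for $name; falling back to lxc" >&2
    lxc delete "$name" --project=multipass --force
}
```

With this loaded, `delete_instance deb12` would run the `lxc delete` command above only when `multipass delete --purge deb12` exits with an error.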

Huge thanks, @ricab! Multipass accidentally does serious damage (blocking deletion of the VM, and even blocking the host machine from rebooting) when it effectively loses control of a VM like this one:

multipass launch -n deb12 https://cloud.debian.org/images/cloud/bookworm/daily/latest/debian-12-generic-amd64-daily.qcow2

Clarification: the above worked in recent weeks but no longer works today, painfully freezing most Multipass actions. Even snap restart multipass.multipassd does not help, and neither did a forced reboot of the entire host PC.

Hence the need to force the deletion of a Multipass VM using LXD, as Multipass itself is unfortunately not capable of reliably deleting its own VMs:

# multipass info deb12
info failed: ssh connection failed: 'Connection refused'

# multipass stop deb12
Stopping deb12 [2022-11-05T12:19:34.936] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/94e4dd5b-67a6-47d1-ae91-1083e0e96044/wait?project=multipass
[2022-11-05T12:19:34.937] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations
stop failed: Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/94e4dd5b-67a6-47d1-ae91-1083e0e96044/wait?project=multipass

# multipass delete deb12
[2022-11-05T12:35:55.949] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass
[2022-11-05T12:35:55.949] [error] [lxd request] Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass
delete failed: Timeout getting response for GET operation on unix://multipass/var/snap/lxd/common/lxd/unix.socket@1.0/operations/8d951713-031f-461f-8512-93d907fa1d09/wait?project=multipass

This lifesaving trick should be better documented within https://multipass.run/docs, e.g. at remove-an-instance or some similar place. Thanks, all!

Background:

  • iiab/iiab#3399

holta • Nov 05 '22 17:11

To clarify the above: even the very latest edge channel version of Multipass cannot reliably delete its own VMs / instances:

# multipass --version
multipass   1.12.0-dev.379+g6cfdc875
multipassd  1.12.0-dev.379+g6cfdc875

# snap info multipass
...
  latest/edge:      1.12.0-dev.379+g6cfdc875 2022-10-30 (8154) 112MB -
installed:          1.12.0-dev.379+g6cfdc875            (8154) 112MB -

holta • Nov 05 '22 17:11

@holta, Multipass is designed to launch Ubuntu VMs, not Debian. You can try, of course, but you're on your own then. It might work, but it is no surprise that it doesn't.

ricab • Nov 08 '22 12:11

Just FYI Debian 12 instances worked fine with Multipass in the past.

The much larger point (this ticket!) is that Multipass accidentally sabotages the host PC (preventing the "reboot" command from working, etc.) when any instance (Debian or Ubuntu or whatever, that's not the point) encounters such everyday testing errors.

So your original request (thank you, @ricab) is quite important: Multipass indeed needs to evolve to be able to delete dysfunctional instances by following the instructions it provides, which for the moment are at https://multipass.run/docs/remove-an-instance

holta • Nov 08 '22 13:11

Just FYI this bug (or something extremely similar!) has become substantially more severe in the past ~10 days (seemingly manifesting in new ways, e.g. known-to-be-reliable Ubuntu instances often-and-intermittently cannot boot). I don't know why.

Certainly known-to-be-reliable Multipass instances (e.g. Ubuntu 22.04, Ubuntu 23.04 pre-releases, and others) frequently fail to start. FYI the Host PC is Ubuntu Server 22.04.

The only workaround I've found so far is to repeatedly reboot the Host PC (rebooting it twice is sometimes necessary) until the Multipass instance in question...finally boots cleanly.

CLARIF 1: Running sudo snap restart multipass is not enough to resolve the problem. Full reboot(s) of the Host PC unfortunately appear to be the only way forward, so far.

CLARIF 2: I just reverted from Multipass's "edge" channel (1.12.x) to its "beta" channel (1.11.1) to see whether that helps going forward.

# multipass version
multipass   1.11.1
multipassd  1.11.1

# lxd version
5.12

holta • Apr 02 '23 22:04