LXD & Incus Windows 11 VMs work on Ubuntu 22.04 with limits.memory=16GiB - But FAIL on Ubuntu 24.04

Open bmullan opened this issue 1 year ago • 5 comments

HOST system

12 cores, 64GB RAM, btrfs file systems


Boot off an Ubuntu 22.04 SSD

Ubuntu 22.04

  • LXD version 5.21.1
  • incus version 6.1
  • Kernel 6.5.0-35

Create a new LXD Windows 11 VM using a backup LXD Windows 11 VM image tarball

Create a new Incus Windows 11 VM using the same backup LXD Windows 11 VM image tarball

Set both LXD and Incus VMs' limits.memory to 8GiB

$ lxc config set lxdwin11 limits.memory=8GiB
$ incus config set incuswin11 limits.memory=8GiB

Start both the Incus and LXD Windows 11 VMs

BOTH work as expected !

Set both LXD and Incus VMs' limits.memory to 16GiB

$ lxc config set lxdwin11 limits.memory=16GiB
$ incus config set incuswin11 limits.memory=16GiB

Start both the Incus and LXD Windows 11 VMs

BOTH work as expected !


Using same HOST system.

Boot off an Ubuntu 24.04 SSD

Ubuntu 24.04

  • LXD version 5.21.1
  • incus version 6.0.0
  • Kernel 6.8.0-31

Create a new LXD Windows 11 VM, again using the same backup LXD Windows 11 VM image tarball

Create a new Incus Windows 11 VM, again using the same backup LXD Windows 11 VM image tarball

Set both LXD and Incus VMs' limits.memory to 8GiB

$ lxc config set lxdwin11 limits.memory=8GiB
$ incus config set incuswin11 limits.memory=8GiB

Start both the Incus and LXD Windows 11 VMs

BOTH work as expected !

Set both LXD and Incus VMs' limits.memory to 16GiB

$ lxc config set lxdwin11 limits.memory=16GiB
$ incus config set incuswin11 limits.memory=16GiB

This is where FAILURE occurs with both LXD and Incus Windows 11 VMs

Start both the Incus and LXD Windows 11 VMs

**Both LXD and Incus Windows 11 VMs with 16GiB memory:**

  • Start to boot
  • Get an IPv6 address but no IPv4 address
  • After 10-15 seconds, BOTH VMs show maximum CPU utilization
  • Then BOTH VMs terminate and enter the "Stopped" state

Something with Ubuntu 24.04 has changed that prevents configuring an LXD or Incus Windows 11 VM with limits.memory=16GiB while it works on Ubuntu 22.04.
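
For anyone trying to reproduce this, the steps above can be condensed into a short script. This is only a sketch: the instance names `lxdwin11` and `incuswin11` come from the report, while the `run` and `try_memory` helpers, the `DRY_RUN` switch, and the 60-second wait are assumptions made for illustration. With `DRY_RUN=1` (the default here) the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# try_memory CLIENT INSTANCE SIZE
# CLIENT is "lxc" or "incus"; INSTANCE and SIZE follow the report.
try_memory() {
    client=$1 instance=$2 size=$3
    run "$client" config set "$instance" "limits.memory=$size"
    run "$client" start "$instance"
    # The report saw the VMs stop after 10-15 seconds of high CPU;
    # 60 seconds is an arbitrary safety margin, not a measured value.
    run sleep 60
    # Columns: name, state, IPv4, IPv6
    run "$client" list "$instance" -c ns46 --format csv
}

try_memory lxc lxdwin11 16GiB
try_memory incus incuswin11 16GiB
```

Running it with `DRY_RUN=0` on both the 22.04 and 24.04 boot disks, once with 8GiB and once with 16GiB, would give four directly comparable state/address lines per tool.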

bmullan avatar May 20 '24 01:05 bmullan

What does lxc info <instance> --show-log show for the running/stuck VMs?

Do you see anything in journalctl that may be useful?

Have you tried using 5.0/stable on 24.04 to see if it works for an additional data point?
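
A rough sketch of how the first two requests could be gathered for both VMs follows. The `diag_cmds` helper is hypothetical, the instance names are taken from the report, and the systemd unit names are assumptions: `snap.lxd.daemon` applies to the snap-packaged LXD, and `incus.service` is assumed for a natively packaged Incus. The 15-minute window is likewise arbitrary. The function only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# diag_cmds CLIENT INSTANCE
# Print the diagnostic commands for one VM without running them.
diag_cmds() {
    client=$1 instance=$2
    case "$client" in
        lxc)   unit=snap.lxd.daemon ;;  # snap-packaged LXD
        incus) unit=incus.service ;;    # assumption: native Incus package
        *)     echo "unknown client: $client" >&2; return 1 ;;
    esac
    # Instance information plus the VM log, as asked for above
    echo "$client info $instance --show-log"
    # Recent daemon messages from the journal
    echo "journalctl -u $unit --since '15 minutes ago' --no-pager"
}

diag_cmds lxc lxdwin11
diag_cmds incus incuswin11
```

To actually collect the output, the printed commands can be piped to a shell right after a failed start, for example `diag_cmds lxc lxdwin11 | sh > lxdwin11-diag.txt 2>&1`. Reading the journal may require root.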

tomponline avatar Jun 17 '24 09:06 tomponline

@simondeziel is this something you could try and confirm is an issue please?

tomponline avatar Jun 17 '24 09:06 tomponline

> HOST system
>
> 12 cores, 64GB RAM, btrfs file systems

Question to self: could it be due to the automatic setting of size.state that is somehow very slow on Noble+btrfs?

simondeziel avatar Jun 19 '24 20:06 simondeziel

I'm unable to reproduce on Noble with an Ubuntu Noble VM:

root@hardhat:~# uname -a
Linux hardhat 6.8.0-35-generic #35-Ubuntu SMP PREEMPT_DYNAMIC Mon May 20 15:51:52 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
root@hardhat:~# snap list lxd
Name  Version      Rev    Tracking     Publisher   Notes
lxd   git-619ae69  28998  latest/edge  canonical✓  -

...
root@hardhat:~# lxc storage create butter btrfs source=/dev/sdd source.wipe=true


# Works
root@hardhat:~# lxc launch ubuntu-daily:24.04 -s butter -c limits.memory=8GiB --vm vu1
...
# Still works
root@hardhat:~# lxc config set vu1 limits.memory=16GiB

Now to try with 5.21/stable before switching to Windows 11 ISO.

simondeziel avatar Jun 19 '24 21:06 simondeziel

Same behavior with 5.21/stable using a Noble VM:

root@hardhat:~# lxc exec vu1 -- uptime
 21:55:34 up 1 min,  0 user,  load average: 0.13, 0.06, 0.02
root@hardhat:~# snap list lxd
Name  Version         Rev    Tracking     Publisher   Notes
lxd   5.21.1-d46c406  28460  5.21/stable  canonical✓  -
root@hardhat:~# lxc exec vu1 -- uptime
 21:55:44 up 1 min,  0 user,  load average: 0.11, 0.06, 0.02
root@hardhat:~# lxc exec vu1 -- free -gt
               total        used        free      shared  buff/cache   available
Mem:              15           0          15           0           0          15
Swap:              0           0           0
Total:            15           0          15

Will try Windows 11 ISO tomorrow.

simondeziel avatar Jun 19 '24 21:06 simondeziel

Sorry for the very long delay here, seems like "tomorrow" just happened ;) I tested with a Windows 11 VM backed by btrfs on LXD 5.21/stable and it worked with both 8GiB and 16GiB of RAM. In both RAM configs, the network remains functional after a few minutes of uptime and both IPv4 and IPv6 work.

Additional information from the system used for this test:

root@joliet:~# uname -a
Linux joliet 6.8.0-39-generic #39-Ubuntu SMP PREEMPT_DYNAMIC Fri Jul  5 21:49:14 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
root@joliet:~# lxc config show win11
architecture: x86_64
config:
  limits.cpu: "4"
  limits.memory: 16GiB
  volatile.cloud-init.instance-id: e2a486fc-0c15-4158-926c-97a43009025d
  volatile.eth0.host_name: tap4bec6453
  volatile.eth0.hwaddr: 00:16:3e:41:10:aa
  volatile.last_state.power: RUNNING
  volatile.last_state.ready: "false"
  volatile.uuid: 11d7f840-c78e-4917-a62c-5d15f3de3050
  volatile.uuid.generation: 11d7f840-c78e-4917-a62c-5d15f3de3050
  volatile.vsock_id: "3642911216"
devices:
  install:
    boot.priority: "10"
    source: /home/ubuntu/Win11_23H2_English_x64v2+repack.iso
    type: disk
  root:
    path: /
    pool: butter
    size: 50GiB
    type: disk
  vtpm:
    path: /dev/tpm0
    type: tpm
ephemeral: false
profiles:
- default
stateful: false
description: ""
root@joliet:~# lxc storage show butter
name: butter
description: ""
driver: btrfs
status: Created
config:
  size: 30GiB
  source: /var/snap/lxd/common/lxd/disks/butter.img
used_by:
- /1.0/instances/win11
locations:
- none
root@joliet:~# lxc list
+-------+---------+--------------------+-----------------------------------------------+-----------------+-----------+
| NAME  |  STATE  |        IPV4        |                     IPV6                      |      TYPE       | SNAPSHOTS |
+-------+---------+--------------------+-----------------------------------------------+-----------------+-----------+
| win11 | RUNNING | 10.69.23.85 (eth0) | fd42:a7c9:7cdf:84d:9ce4:708f:9f10:dd7b (eth0) | VIRTUAL-MACHINE | 0         |
|       |         |                    | fd42:a7c9:7cdf:84d:216:3eff:fe41:10aa (eth0)  |                 |           |
+-------+---------+--------------------+-----------------------------------------------+-----------------+-----------+
root@joliet:~# snap list 
Name    Version         Rev    Tracking       Publisher   Notes
core22  20240419        1439   latest/stable  canonical✓  base
lxd     5.21.2-34459c8  29568  5.21/stable    canonical✓  -
snapd   2.63            21759  latest/stable  canonical✓  snapd

simondeziel avatar Aug 07 '24 21:08 simondeziel

It may have been fixed between when I posted this and your reply here.

I was seeing the problem with LXD version 5.21.1 and you are using 5.21.2.

The kernel and/or QEMU might have also changed.

Whatever the root cause, it's good to know that it's fixed. Thanks.

bmullan avatar Aug 08 '24 00:08 bmullan