cloud-hypervisor
Unable to create virtual machine with more than ~2.8G of RAM on arm64
Describe the bug
I have been unable to create a virtual machine with more than ~2.8G of RAM on arm64. Happy to provide more details / dig into this further, but wanted to see if this was a known limitation first. The virtual machine boots successfully with a larger --memory size, but /proc/meminfo still only reports 2.8G of RAM.
To Reproduce
Start a virtual machine with --memory size=8G on an arm64 host. Run cat /proc/meminfo
Version: cloud-hypervisor v40.0.0
VM configuration: sudo cloud-hypervisor --serial tty --console pty --memory size=8G --kernel /var/lib/cloud-hypervisor/hypervisor-fw --disk path=/var/lib/cloud-hypervisor/jammy-server-cloudimg-arm64.raw
Guest OS version details: Ubuntu 22.04
Host OS version details: Ubuntu 22.04
@jinankjain have you seen this?
What if you do direct kernel boot? That helps us understand if the issue is in the firmware or the kernel.
No, I haven't seen such an issue when trying to boot with MSHV.
Are you using direct kernel boot or hypervisor-fw?
Direct kernel boot, hypervisor-fw does not boot yet with MSHV.
I just tested with direct kernel boot and was able to spin up a virtual machine with 720G of RAM. I will look deeper into rust-hypervisor-fw and see what I can find.
I added a debug log to rust-hypervisor-fw to print the fdt. It looks like it is only detecting a single memory region entry:
[INFO] Memory region 2048MiB@0x40000000
After adding equivalent logs to cloud-hypervisor, I verified that both fdt entries are being written:
cloud-hypervisor: 26.946987ms: <vmm> INFO:arch/src/aarch64/fdt.rs:600 -- adding memory fdt entry 0x40000000 0x80000000
cloud-hypervisor: 26.964270ms: <vmm> INFO:arch/src/aarch64/fdt.rs:621 -- adding memory fdt entry 0x100000000 0xb380000000
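For reference, the split visible in these two log lines can be sketched as below. The constants are read off the logs above; treating them as a fixed layout is an assumption of this sketch, and `split_memory` is a hypothetical helper, not a cloud-hypervisor function.

```rust
/// Guest RAM base and start of the high region, as seen in the logs above.
/// Treating these as fixed constants is an assumption of this sketch.
const RAM_START: u64 = 0x4000_0000;
const HIGH_RAM_START: u64 = 0x1_0000_0000;
/// Size of the low region seen in the logs (2 GiB).
const LOW_RAM_SIZE: u64 = 0x8000_0000;

/// Split a requested guest memory size (in bytes) into (base, size) regions.
/// Hypothetical helper for illustration only.
fn split_memory(total: u64) -> Vec<(u64, u64)> {
    if total <= LOW_RAM_SIZE {
        // Everything fits in the low region.
        return vec![(RAM_START, total)];
    }
    vec![
        (RAM_START, LOW_RAM_SIZE),
        // The remainder is placed above the 4 GiB boundary.
        (HIGH_RAM_START, total - LOW_RAM_SIZE),
    ]
}
```

Under this assumed layout, `split_memory(720 << 30)` yields the two entries logged above, so a reader that stops after the first entry sees only the 2 GiB low region.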
EDIT: Found a good article by @michael2012z documenting fdt use in cloud-hypervisor!
It looks like there is no top-level memory/ node to group memory regions as rust-hypervisor-fw expects. Each memory region is being added as its own top-level node. It looks like this is allowed, so I think that we should update rust-hypervisor-fw to support the fdt that cloud-hypervisor is emitting rather than updating cloud-hypervisor. It should be a relatively easy fix to have rust-hypervisor-fw iterate through all top-level memory nodes. I am not sure if this actually affects the kernel's view of available memory regions.
cloud-hypervisor: 26.832355ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1089 -- memory@40000000/
cloud-hypervisor: 26.851647ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1126 -- device_type : "memory"
cloud-hypervisor: 26.876232ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1137 -- reg : [0, 40000000, 0, 80000000]
cloud-hypervisor: 26.909241ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1089 -- memory@100000000/
cloud-hypervisor: 26.928481ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1126 -- device_type : "memory"
cloud-hypervisor: 26.952781ms: <vmm> DEBUG:arch/src/aarch64/fdt.rs:1137 -- reg : [1, 0, B3, 80000000]
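A minimal sketch of what "iterate through all top-level memory nodes" could look like, using the dump above as input. The `Node` type is a hypothetical stand-in for a parsed fdt node (it is not taken from rust-hypervisor-fw or any fdt crate), and the sketch assumes `#address-cells = 2` and `#size-cells = 2`, which matches the four-cell `reg` values shown.

```rust
/// Hypothetical stand-in for a parsed top-level fdt node.
struct Node<'a> {
    device_type: Option<&'a str>,
    reg: &'a [u32],
}

/// Collect (base, size) pairs from every node whose device_type is "memory".
/// Assumes #address-cells = 2 and #size-cells = 2, as in the dump above.
fn memory_regions(nodes: &[Node]) -> Vec<(u64, u64)> {
    let mut regions = Vec::new();
    for node in nodes {
        if node.device_type != Some("memory") {
            continue;
        }
        // Each entry is four 32-bit cells: base_hi, base_lo, size_hi, size_lo.
        for cells in node.reg.chunks_exact(4) {
            let base = (u64::from(cells[0]) << 32) | u64::from(cells[1]);
            let size = (u64::from(cells[2]) << 32) | u64::from(cells[3]);
            regions.push((base, size));
        }
    }
    regions
}
```

Fed the two nodes from the dump, this returns both regions instead of only `memory@40000000`.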
EDIT 2: It looks like rust-hypervisor-firmware explicitly excludes the fdt from the efi configuration table on arm64 to force linux to use acpi information while booting. It seems reasonable that linux would be using fdt when doing direct kernel boot and acpi when booting from firmware. It is possible that cloud-hypervisor is reporting some incorrect acpi information on arm.
EDIT 3: It looks like linux gets the memory map from the efi_get_memory_map function. This is implemented by rust-hypervisor-firmware using the bad fdt parsing logic noted above.
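To illustrate why the parsing bug caps the guest's memory, here is a simplified sketch of turning fdt memory regions into an EFI-style memory map. `MemoryDescriptor` is a trimmed-down, hypothetical version of the UEFI memory descriptor (type 7 is EfiConventionalMemory, and EFI pages are 4 KiB); the real firmware also has to reserve ranges for itself, which this sketch ignores. The 6 GiB high region used in the usage note is an assumption for an 8G guest, extrapolated from the layout in the logs.

```rust
const EFI_PAGE_SIZE: u64 = 4096;
const EFI_CONVENTIONAL_MEMORY: u32 = 7;

/// Simplified, hypothetical memory map entry (subset of the UEFI descriptor fields).
#[derive(Debug, PartialEq)]
struct MemoryDescriptor {
    memory_type: u32,
    physical_start: u64,
    number_of_pages: u64,
}

/// Build one conventional-memory descriptor per (base, size) region.
fn descriptors_from_regions(regions: &[(u64, u64)]) -> Vec<MemoryDescriptor> {
    regions
        .iter()
        .map(|&(base, size)| MemoryDescriptor {
            memory_type: EFI_CONVENTIONAL_MEMORY,
            physical_start: base,
            number_of_pages: size / EFI_PAGE_SIZE,
        })
        .collect()
}

/// Total bytes described by the map, i.e. what the OS can see as usable RAM.
fn total_bytes(map: &[MemoryDescriptor]) -> u64 {
    map.iter().map(|d| d.number_of_pages * EFI_PAGE_SIZE).sum()
}
```

If only the first region `(0x40000000, 0x80000000)` reaches this step, the map describes 2 GiB no matter what `--memory size` was requested; passing both regions restores the full amount.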
EDIT 4: Success! I will put out a PR
With my patch to rust-hypervisor-firmware, I was able to boot a virtual machine with up to 126 GiB of RAM
sudo cloud-hypervisor \
--serial tty \
--console pty \
--memory size=126G,hugepages=on,hugepage_size=1G \
--kernel /var/lib/cloud-hypervisor/hypervisor-fw \
--disk path=/var/lib/cloud-hypervisor/jammy-server-cloudimg-arm64.raw
However, I was still not able to boot a virtual machine with 127 GiB of RAM. The process hangs with 100% CPU on vcpu0 somewhere in the linux EFI boot stub.
[INFO] Memory region 2048MiB@0x40000000
[INFO] Memory region 128000MiB@0x100000000
[INFO] Booting with FDT
[INFO] Found PCI device vendor=8086 device=d57 in slot=0
[INFO] Found PCI device vendor=1af4 device=1043 in slot=1
[INFO] Found PCI device vendor=1af4 device=1042 in slot=2
[INFO] Found PCI device vendor=1af4 device=1044 in slot=3
[INFO] PCI Device: 0:2.0 1af4:1042
[INFO] Bar: type=MemorySpace32 address=0x2ff80000 size=0x80000
[INFO] Bar: type=MemorySpace32 address=0x0 size=0x0
[INFO] Bar: type=MemorySpace32 address=0x0 size=0x0
[INFO] Bar: type=MemorySpace32 address=0x0 size=0x0
[INFO] Bar: type=MemorySpace32 address=0x0 size=0x0
[INFO] Bar: type=MemorySpace32 address=0x0 size=0x0
[INFO] Updated BARs: type=MemorySpace32 address=2ff80000 size=80000
[INFO] Updated BARs: type=MemorySpace32 address=0 size=0
[INFO] Updated BARs: type=MemorySpace32 address=0 size=0
[INFO] Updated BARs: type=MemorySpace32 address=0 size=0
[INFO] Updated BARs: type=MemorySpace32 address=0 size=0
[INFO] Updated BARs: type=MemorySpace32 address=0 size=0
[INFO] Virtio block device configured. Capacity: 4612096 sectors
[INFO] Found EFI partition
[INFO] Filesystem ready
[WARN] Error loading default entry: File(NotFound)
[INFO] Using EFI boot.
[INFO] Found bootloader: \EFI\BOOT\BOOTAA64.EFI
[INFO] Executable loaded
Failed to set MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
Failed to set MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListTrustedRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported
Could not create variable: Unsupported
Failed to set MokListRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListRT: Unsupported
Failed to set MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListXRT: Unsupported
TPM logging failed: Unsupported
Could not create MokListTrustedRT: Unsupported
Something has gone seriously wrong: import_mok_state() failed: Unsupported
TPM logging failed: Unsupported
Fixed by https://github.com/cloud-hypervisor/rust-hypervisor-firmware/pull/346