firecracker
Cannot boot a 32MB VM
VMs cannot be booted with 32MB (or lower) as the memory size. It does work with >= 33MB though.
Steps to reproduce:
# In one terminal
./firecracker --api-sock /tmp/firecracker.socket
# In another terminal
curl --unix-socket /tmp/firecracker.socket -i \
-X PUT 'http://localhost/boot-source' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"kernel_image_path": "./hello-vmlinux.bin",
"boot_args": "console=ttyS0 reboot=k panic=1 pci=off"
}'
curl --unix-socket /tmp/firecracker.socket -i \
-X PUT 'http://localhost/drives/rootfs' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"drive_id": "rootfs",
"path_on_host": "./hello-rootfs.ext4",
"is_root_device": true,
"is_read_only": false
}'
curl --unix-socket /tmp/firecracker.socket -i \
-X PUT 'http://localhost/machine-config' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"vcpu_count": 1,
"mem_size_mib": 32
}'
curl --unix-socket /tmp/firecracker.socket -i \
-X PUT 'http://localhost/actions' \
-H 'Accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"action_type": "InstanceStart"
}'
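The four requests above differ only in their path and JSON body, so when repeating the test with different memory sizes they can be wrapped in a small helper. This is only a sketch: `FC_SOCK`, `fc_put`, and `machine_config_json` are hypothetical names introduced here, not part of Firecracker, and the socket path is the one used in the steps above.

```shell
#!/bin/sh
# Sketch only: wraps the repeated curl invocations from the reproduction steps.
# FC_SOCK, fc_put and machine_config_json are hypothetical names.
FC_SOCK="${FC_SOCK:-/tmp/firecracker.socket}"

# Build the /machine-config payload for a vCPU count ($1) and memory size in MiB ($2).
machine_config_json() {
    printf '{"vcpu_count": %d, "mem_size_mib": %d}' "$1" "$2"
}

# Send a PUT request with a JSON body ($2) to an API path ($1) on the socket.
fc_put() {
    curl --unix-socket "$FC_SOCK" -i \
        -X PUT "http://localhost/$1" \
        -H 'Accept: application/json' \
        -H 'Content-Type: application/json' \
        -d "$2"
}

# Example (needs a Firecracker process listening on $FC_SOCK):
#   fc_put machine-config "$(machine_config_json 1 33)"
#   fc_put actions '{"action_type": "InstanceStart"}'
```

Changing only the second argument of `machine_config_json` between runs makes it easier to compare the 32 and 33 cases without retyping the other requests.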
At this point, I'd expect the VM to boot, but nothing happens.
The logs look like this (I was adding a network interface and using different boot args for these logs, but I confirmed the issue with the reproduction steps above):
==> logs.fifo <==
Running Firecracker v0.15.2
2019-04-03T11:21:41.321693167 [anonymous-instance] The synchronous Put request on "/logger" with body "{\n \"log_fifo\": \"logs.fifo\",\n \"metrics_fifo\": \"metrics.fifo\",\n \"level\": \"Info\"\n }" was executed successfully. Status code: 204 No Content.
2019-04-03T11:21:41.328947750 [anonymous-instance] The API server received a synchronous Put request on "/boot-source" with body "{\n \"kernel_image_path\": \"./hello-vmlinux.bin\",\n \"boot_args\": \"console=ttyS0 reboot=k panic=1 pci=off ip=172.17.100.10::172.17.100.1::myvm:eth0:on:8.8.8.8:8.8.4.4 init=/custom.init\"\n }".
2019-04-03T11:21:41.329092795 [anonymous-instance] The synchronous Put request on "/boot-source" with body "{\n \"kernel_image_path\": \"./hello-vmlinux.bin\",\n \"boot_args\": \"console=ttyS0 reboot=k panic=1 pci=off ip=172.17.100.10::172.17.100.1::myvm:eth0:on:8.8.8.8:8.8.4.4 init=/custom.init\"\n }" was executed successfully. Status code: 204 No Content.
2019-04-03T11:21:41.336662067 [anonymous-instance] The API server received a synchronous Put request on "/drives/rootfs" with body "{\n \"drive_id\": \"rootfs\",\n \"path_on_host\": \"./disk.img\",\n \"is_root_device\": true,\n \"is_read_only\": false\n }".
2019-04-03T11:21:41.336735197 [anonymous-instance] The synchronous Put request on "/drives/rootfs" with body "{\n \"drive_id\": \"rootfs\",\n \"path_on_host\": \"./disk.img\",\n \"is_root_device\": true,\n \"is_read_only\": false\n }" was executed successfully. Status code: 204 No Content.
2019-04-03T11:21:41.344842834 [anonymous-instance] The API server received a synchronous Put request on "/machine-config" with body "{\n \"vcpu_count\": 1,\n \"mem_size_mib\": 32\n }".
2019-04-03T11:21:41.345022508 [anonymous-instance] The synchronous Put request on "/machine-config" with body "{\n \"vcpu_count\": 1,\n \"mem_size_mib\": 32\n }" was executed successfully. Status code: 204 No Content.
2019-04-03T11:21:41.352351031 [anonymous-instance] The API server received a synchronous Put request on "/network-interfaces/eth0" with body "{\n \"iface_id\": \"eth0\",\n \"guest_mac\": \"AA:FC:00:00:00:01\",\n \"host_dev_name\": \"tap0\"\n }".
2019-04-03T11:21:41.352627210 [anonymous-instance] The synchronous Put request on "/network-interfaces/eth0" with body "{\n \"iface_id\": \"eth0\",\n \"guest_mac\": \"AA:FC:00:00:00:01\",\n \"host_dev_name\": \"tap0\"\n }" was executed successfully. Status code: 204 No Content.
2019-04-03T11:21:41.361110644 [anonymous-instance] The API server received a synchronous Put request on "/actions" with body "{\n \"action_type\": \"InstanceStart\"\n }".
2019-04-03T11:21:41.361205150 [anonymous-instance] VMM received instance start command
2019-04-03T11:21:41.361355850 [anonymous-instance] Guest memory starts at 7f0fe500c000
2019-04-03T11:21:41.383692863 [anonymous-instance] The synchronous Put request on "/actions" with body "{\n \"action_type\": \"InstanceStart\"\n }" was executed successfully. Status code: 204 No Content.
dmesg on the host only shows this:
[1100491.901102] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
I can confirm that this happens for me as well with hello-vmlinux.bin. The backtrace from gdb doesn't seem very useful:
#0 epoll_pwait (fd=8, ev=0x202cf40, cnt=100, to=-1, sigs=0x0) at src/linux/epoll.c:29
#1 0x000000000060dd12 in epoll::wait (epfd=8, timeout=-1, buf=...)
at /home/xzaramurd/.cargo/registry/src/github.com-1ecc6299db9ec823/epoll-4.0.1/src/lib.rs:126
#2 0x00000000004ba19b in vmm::Vmm::run_control (self=0x7fb64b706fa0) at vmm/src/lib.rs:1423
#3 0x000000000049adc2 in vmm::start_vmm_thread::{{closure}} () at vmm/src/lib.rs:2031
#4 0x00000000004f300e in std::sys_common::backtrace::__rust_begin_short_backtrace (f=...)
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/sys_common/backtrace.rs:135
#5 0x00000000004e77f2 in std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}} ()
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/thread/mod.rs:469
#6 0x00000000004f1772 in <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (self=..., _args=())
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/panic.rs:309
#7 0x0000000000523b9c in std::panicking::try::do_call (data=0x7fb64b7076e0 "\300C\002\002")
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/panicking.rs:297
#8 0x0000000000b50d89 in __rust_maybe_catch_panic () at src/libpanic_abort/lib.rs:29
#9 0x0000000000523a86 in std::panicking::try (f=...) at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/panicking.rs:276
#10 0x00000000004f1802 in std::panic::catch_unwind (f=...) at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/panic.rs:388
#11 0x00000000004e75d6 in std::thread::Builder::spawn_unchecked::{{closure}} ()
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/libstd/thread/mod.rs:468
#12 0x00000000004e7b53 in <F as alloc::boxed::FnBox<A>>::call_box (self=0x2024f80, args=())
at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/liballoc/boxed.rs:749
#13 0x0000000000b506fe in call_once<(),()> () at /rustc/fc50f328b0353b285421b8ff5d4100966387a997/src/liballoc/boxed.rs:759
#14 start_thread () at src/libstd/sys_common/thread.rs:14
#15 thread_start () at src/libstd/sys/unix/thread.rs:81
#16 0x0000000000b682c5 in start (p=<optimized out>) at src/thread/pthread_create.c:147
#17 0x0000000000b68f85 in __clone () at src/thread/x86_64/clone.s:21
I tried minimizing the kernel slightly by stripping the symbols, and it still has this issue, though stripping shouldn't change the layout of the image. I also tried compiling a new kernel (based on 5.1.0) with even less in it. Although that binary is slightly smaller, it fails even with 33M, but works with slightly more (e.g. 35M). After disabling SMP, it worked with 29M.
So 32M doesn't seem to be a limitation of Firecracker itself. However, Firecracker doesn't communicate well what is happening or why it aborts, and there might be an issue preventing it from loading the kernel image correctly when the available memory is constrained.
Hi @jeromegn, this is worth looking into, but since it's low priority we will come back to it later.
There was a similar issue with cloud-hypervisor attempting to boot a 16 MB image, which turned out to be a bug: https://github.com/cloud-hypervisor/cloud-hypervisor/issues/803
We should double check if it's the same for Firecracker.
Hi @jeromegn. Sorry for taking so long to respond. Is this still a relevant problem for you? We don't have a CI test that boots a 32MB VM, so we can't confidently say that Firecracker supports it. However, we think it is possible that the guest kernel or userspace, rather than Firecracker itself, isn't able to run under such memory constraints. We would like to run a test to verify that, but it isn't going to be our first priority, as we don't have an internal use case for it.
This is not related to Firecracker, but rather to the size of your Linux kernel and the needs of your rootfs.
For example, the 4.14 kernel is our smallest kernel:
% ls -l build/img/x86_64/vmlinux-4.14.336
... 20M 2024-03-25T11:54 build/img/x86_64/vmlinux-4.14.336
We can boot it with ~50MB; after booting, the guest reports 34Mi total, with 24Mi used and about 1.5Mi free.
./tools/devtool sandbox -- --kernel /firecracker/build/img/x86_64/vmlinux-4.14.336 --guest-mem-size 50MB
root@ubuntu-fc-uvm:~# free -h
               total        used        free      shared  buff/cache   available
Mem:            34Mi        24Mi       1.5Mi       912Ki        12Mi        10Mi
Swap:             0B          0B          0B
This is booting an Ubuntu 22.04 guest rootfs. There is nothing in Firecracker preventing a 32MB VM if you can get a small enough kernel+rootfs combination. You may be able to reduce the memory needs by stripping things you don't need from the kernel and the rootfs.
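As a rough way to see how much of the configured memory never reaches userspace, the `total` column from `free` can be compared with the configured size. The helper below is a minimal sketch: `hidden_mib` is a hypothetical name, it expects `free -m` output (plain MiB numbers) on stdin, and it treats the 50MB used above as 50 MiB, which is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: given the configured guest memory in MiB ($1) and
# `free -m` output on stdin, print how many MiB are not reported as "total".
# Usage inside the guest: free -m | hidden_mib 50
hidden_mib() {
    configured_mib="$1"
    # In `free -m` output, the second field of the "Mem:" row is the total in MiB.
    awk -v cfg="$configured_mib" '/^Mem:/ { print cfg - $2 }'
}
```

With the numbers from this thread (50 configured, 34 reported), the gap is 16 MiB. If a similar gap applied at smaller sizes, which is not verified here, a 32 MiB guest would be left with roughly 16 MiB, less than the 24Mi this rootfs shows as used after boot.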
Closing as this is not a Firecracker issue but rather a guest rootfs+kernel issue.
Feel free to reopen if you think the resolution is insufficient or if there's new information.