Implement VM support for OCI images
Currently OCI images can only be run as containers on top of Incus.
It would make sense to also allow them to run as lightweight virtual machines (--vm flag).
To do so, I expect we'll want to:
- Publish a kernel + initrd combination on the image server (x86_64 and aarch64)
- Have that initrd contain a small loader that will:
- Mount the root filesystem over virtiofs
- Re-shuffle the mount table and pivot_root
- Mount the agent drive(s)
- Spawn incus-agent
- Exec into the container entry point, ensuring it's attached to /dev/console
- Have Incus treat all OCI images as both container and VM capable
- For those images, rather than allocating a block volume, just stick with a single filesystem volume as is normally done for containers
- On startup, if dealing with an OCI image, ensure that we have a suitable kernel and initrd available for it, then bypass the firmware logic and use direct kernel boot to start the image
This should give us a very lightweight VM which effectively acts almost identically to a traditional OCI container but can still handle all of our normal VM devices and config options.
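As a rough illustration, the loader steps above might look something like this minimal initrd `/init` script. This is only a sketch: the virtiofs share tags (`rootfs`, `config`), the agent path, and the entry point name are all assumptions, not the actual implementation:

```shell
#!/bin/sh
# Hypothetical initrd /init sketch; share tags and paths are assumptions.
mount -t proc proc /proc
mount -t sysfs sysfs /sys
mount -t devtmpfs devtmpfs /dev

# Mount the root filesystem over virtiofs (assumed share tag "rootfs").
mkdir -p /newroot
mount -t virtiofs rootfs /newroot

# Re-shuffle the mount table and pivot_root into the new root.
for fs in proc sys dev; do mount --move /$fs /newroot/$fs; done
mkdir -p /newroot/oldroot
cd /newroot
pivot_root . oldroot
umount -l /oldroot

# Mount the agent drive (assumed virtiofs share tag "config").
mkdir -p /run/incus_config
mount -t virtiofs config /run/incus_config

# Spawn incus-agent, then exec the container entry point on /dev/console.
/run/incus_config/incus-agent &
exec setsid sh -c 'exec /entrypoint </dev/console >/dev/console 2>&1'
```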
+1
Publish a kernel + initrd combination on the image server (x86_64 and aarch64)
Yes, if I'm reading the issue correctly, a user with the following qemu workflow:

```
qemu-system-x86_64 -enable-kvm -kernel bzImage -initrd rootfs.cpio.gz --append "console=ttyS0 init=/init" -nographic
```

would be able to publish/upload their kernel and initrd (bzImage / rootfs.cpio.gz) and run them.
For those images, rather than allocate a block volume, just stick with the single-volume as is normally done for containers
Consider having the option of no volume at all (think of a busybox-type scenario where everything is in RAM and the distro never reaches pivot_root). IMHO most users would expect a volume by default; having the option of none is useful, as there's less to tidy up if it's not needed.
A more complex use case might be how to translate a user's networking expectations into Incus, e.g.:

```
qemu-system-x86_64 -enable-kvm -kernel bzImage -initrd rootfs.cpio.gz \
  --append "console=ttyS0 init=/init ip=192.168.1.2:192.168.1.1:192.168.1.1:255.255.255.0::eth0:off" -nographic \
  -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
  -device e1000,netdev=net0
```
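For reference, a rough equivalent of the tap + `ip=` setup above using Incus's existing device config might look like this (instance, image, and network names are placeholders; the actual mapping is still to be defined):

```shell
# Hypothetical Incus equivalent; names are placeholders.
incus init my-kernel-image my-vm --vm
# Attach a NIC on a managed bridge instead of a manual tap device.
incus config device add my-vm eth0 nic network=incusbr0
# Static addressing roughly equivalent to the ip= kernel argument.
incus config device set my-vm eth0 ipv4.address=192.168.1.2
incus start my-vm
```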
Fantastic live demo by the way!
Hello! May @janetkimmm and I take on this issue? We are students at UT Austin taking a virtualization course and wanting to contribute to open source repositories.
Hello! I'll be the one working with her!
@shama7g @janetkimmm I'd recommend against this particular one as it's quite the huge one, not just requiring potentially pretty significant code changes to Incus but also requiring the creation of a custom Linux kernel and minimal Linux environment (initrd) to make it work.
If you're looking for a larger feature to work on, I'd currently recommend one of:
- https://github.com/lxc/incus/issues/1822 (pretty well defined, would solve a problem that many users are running into right now)
- https://github.com/lxc/incus/issues/871 (pretty well defined but workarounds exist so not affecting too many users at the moment)
- https://github.com/lxc/incus/issues/51 (should be easy to implement but exact logic isn't well defined, so will require a bit of thought about the problem prior to implementation)
This issue currently seems to only describe supporting normal OCI images by providing a minimal VM image that can mount and run the OCI image.
Have you also considered supporting booting of bootable container images? (https://docs.fedoraproject.org/en-US/bootc/, https://containers.github.io/bootable/)
Bootable containers is a project pushed by Fedora that allows supported container images to also run on bare metal or VMs, and those systems to be updated by pulling a newer image. A bootable container image looks the same as a normal OCI image but also includes a kernel (and the bootc utility to update the OS).
To boot a bootable container image in a VM, it first needs to be converted to a disk image. This can either be done with bootc-image-builder or by running the image as a normal container, mounting an empty disk image into the container, and running `bootc install to-disk --via-loopback [...]`
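For context, the loopback conversion described above typically looks something like this (image name, disk size, and exact podman flags are illustrative, not prescriptive):

```shell
# Create an empty disk image and let bootc install the OS into it
# via a loopback device (requires a privileged container).
mkdir -p output
truncate -s 10G output/disk.img
podman run --rm --privileged --pid=host \
  -v ./output:/output \
  quay.io/fedora/fedora-bootc:41 \
  bootc install to-disk --via-loopback /output/disk.img
```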
I don't think those will be supported by the same mechanism. As you've said, the bootc images are generally expected to be converted to a full disk and then booted from that using normal UEFI. That's much closer to our normal full VMs than to the OCI VM story.
Instead I suspect we may initially add support for the bootc VMs through incus-migrate as that's the tool which already handles the somewhat similar (from our point of view) OVA format.
We could pretty easily have incus-migrate reuse some of our OCI code to fetch the image, then rely on bootc to convert it to a disk image and finally push it to Incus through our migration API.
Will this replace the traditional VM images on https://images.linuxcontainers.org/, or would this be implemented as a third "type" of instance (VM, micro-VM, container)? I think getting rid of traditional images would create some issues, with distros not booting with a certain kernel or not booting the way they like. This seems to me better suited for temporary VMs.
Traditional images aren't going anywhere. The goal here is to have a full set with:
- container
- container (OCI)
- virtual-machine
- virtual-machine (OCI)
That last one is what will be implemented here, effectively making it possible to do `incus launch docker:nginx my-nginx-vm --vm`
Any thoughts on when this could move to "Soon"? I understand it's quite a big undertaking.
Likely going to be an early next year kind of thing.