sysbox
Add support for checkpoint
Is it possible to add checkpointing support to sysbox-runc?
I tried to checkpoint a container running under sysbox and got the following error:
configured runtime does not support checkpoint/restore
I also noticed that the checkpoint/restore tests are disabled here: https://github.com/nestybox/sysbox-runc/blob/60ca93c783b19c63581e34aa183421ce0b9b26b7/tests/integration/help.bats#L19
I spent some time reading the code, and it looks like the checkpoint command is available, with the caveat that it is untested.
However, I can see that it was removed from sysbox-runc in this commit: https://github.com/nestybox/sysbox-runc/commit/ca1deedadeca3b3e6e3bf5654b1656e56058fa70
I am guessing it probably will not work?
Hi @stijndehaes,
Sysbox does not support checkpoint / restore yet (see Limitations).
It can be achieved, but it's a heavy lift engineering-wise because Sysbox holds more container state than the OCI runc does (e.g., it partially virtualizes /proc and /sys inside the container), so checkpoint / restore must save and restore that state to/from disk.
We don't have any near-term plans to add this functionality, but I'm curious about your use case for it in the context of Sysbox containers.
Thanks!
@ctalledo I spent some time looking at it and indeed things are not so easy :)
Our use case is hosting IDEs with Sysbox: we want to be able to suspend a running IDE when the user is not actively using it, and spin it back up when needed.
The alternative I am currently exploring is creating a tarball from the file system of the container. I use the CRI API to find the root volume on the node, and use container archive functionality to create a tarball of that folder. I then use that tarball to create a container image that I can push to a registry.
Thanks @stijndehaes for the insight.
The alternative I am currently exploring is creating a tarball from the file system of the container.
That can be a bit tricky because Sysbox creates implicit mounts into the container at mountpoints such as /var/lib/docker and similar. It does this to ensure Docker, containerd, K8s, etc., can run inside the container (i.e., it's a trick to avoid overlayfs-on-overlayfs). Thus, capturing the container state requires capturing not only its root filesystem, but also the implicit mounts, which are stored on the host under /var/lib/sysbox.
Sysbox does support docker commit, which will capture the entire container filesystem state (including the implicit mounts) into a new container image, so that may help. It's not the same as checkpoint / restore because it does not capture the in-memory state of the container, but it's the closest option I can think of.
Hope that helps.
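For readers who want to try the docker commit route described above, here is a minimal Python sketch of a suspend/resume flow. It is only an illustration under assumptions, not something taken from the Sysbox docs: the container name, image tag, and helper function names are hypothetical, and the flow assumes Docker is the container manager with the sysbox-runc runtime registered. As noted above, in-memory state is not preserved, so processes in the resumed container start fresh.

```python
import subprocess
from typing import List

# Runtime name as registered with Docker by a standard Sysbox install.
SYSBOX_RUNTIME = "sysbox-runc"


def commit_cmd(container: str, image: str) -> List[str]:
    """Snapshot the container's filesystem into a new image."""
    return ["docker", "commit", container, image]


def stop_cmd(container: str) -> List[str]:
    """Stop the running container once it has been committed."""
    return ["docker", "stop", container]


def remove_cmd(container: str) -> List[str]:
    """Remove the stopped container so its name can be reused."""
    return ["docker", "rm", container]


def resume_cmd(image: str, name: str) -> List[str]:
    """Start a new Sysbox container from the committed image."""
    return ["docker", "run", "-d", f"--runtime={SYSBOX_RUNTIME}",
            "--name", name, image]


def run_all(cmds: List[List[str]], dry_run: bool) -> List[List[str]]:
    """Print the commands (dry run) or execute them in order."""
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if a step fails
    return cmds


def suspend(container: str, image: str, dry_run: bool = True) -> List[List[str]]:
    """Commit, stop, and remove a container (filesystem state only)."""
    cmds = [commit_cmd(container, image), stop_cmd(container),
            remove_cmd(container)]
    return run_all(cmds, dry_run)


def resume(image: str, name: str, dry_run: bool = True) -> List[List[str]]:
    """Recreate the container from the image produced by suspend()."""
    return run_all([resume_cmd(image, name)], dry_run)
```

Calling `suspend("ide-1", "ide-1:suspended")` with the default `dry_run=True` only prints the commands, which makes it easy to review them before running anything for real. The committed image could also be pushed to a registry with docker push before resuming on another node.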