Add support for checkpoint

Open stijndehaes opened this issue 2 years ago • 4 comments

Is it possible to add checkpointing support to sysbox-runc?

I tried to checkpoint a container running under sysbox and got the following error:

configured runtime does not support checkpoint/restore

Also noticed the checkpoint restore tests are disabled here: https://github.com/nestybox/sysbox-runc/blob/60ca93c783b19c63581e34aa183421ce0b9b26b7/tests/integration/help.bats#L19

stijndehaes avatar Jun 29 '23 12:06 stijndehaes

I spent some time checking out the code and it looks like the checkpoint command is available but with the caveat that it is untested.

However I can see in this commit it was removed from sysbox-runc: https://github.com/nestybox/sysbox-runc/commit/ca1deedadeca3b3e6e3bf5654b1656e56058fa70

I am guessing it probably will not work?

stijndehaes avatar Jun 29 '23 13:06 stijndehaes

Hi @stijndehaes,

Sysbox does not support checkpoint / restore yet (see Limitations).

It can be achieved, but it's a heavy lift engineering-wise because Sysbox holds more container state than the OCI runc does (e.g., it partially virtualizes /proc and /sys inside the container), so checkpoint / restore would have to save that state to disk and restore it from disk.

We don't have any near term plans to add this functionality, but curious about your use case for it in the context of Sysbox containers.

Thanks!

ctalledo avatar Jul 02 '23 05:07 ctalledo

@ctalledo I spent some time looking at it and indeed things are not so easy :)

Our use case is hosting IDEs with Sysbox: we want to be able to suspend a running IDE when the user is not actively using it, and spin it back up when needed.

The alternative I am currently exploring is creating a tarball from the file system of the container. I use the CRI API to find the root volume on the node, and use container archive functionality to create a tarball of that folder. I then use that tarball to create a container image I can push to a registry.

stijndehaes avatar Jul 06 '23 05:07 stijndehaes

Thanks @stijndehaes for the insight.

The alternative I am currently exploring is creating a tarball from the file system of the container.

That can be a bit tricky because Sysbox creates implicit mounts into the container at mountpoints such as /var/lib/docker and similar. It does this to ensure Docker, containerd, K8s, etc., can run inside the container (i.e., it's a trick to avoid overlayfs-on-overlayfs). Thus, capturing the container state requires capturing not only its root filesystem, but also the implicit mounts, which are stored on the host under /var/lib/sysbox.
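To illustrate what "root filesystem plus implicit mounts" could look like in a tarball-based approach, here is a minimal Python sketch that packs several host directories into one archive. The function name and the example paths in the usage note are hypothetical; the actual per-container layout under /var/lib/sysbox is not described in this thread and would need to be looked up on the node.

```python
import tarfile
from pathlib import Path


def archive_container_state(sources: dict[str, str], out_path: str) -> list[str]:
    """Pack several host directories into one gzip-compressed tarball.

    `sources` maps the name a tree gets inside the archive to its host path,
    e.g. the container rootfs plus each implicit-mount backing directory.
    Returns the sorted list of member names written to the archive.
    """
    # Validate everything first so a bad path does not leave a partial archive.
    for host_dir in sources.values():
        if not Path(host_dir).is_dir():
            raise FileNotFoundError(host_dir)

    with tarfile.open(out_path, "w:gz") as tar:
        for arcname, host_dir in sources.items():
            # tar.add() recurses into directories by default.
            tar.add(host_dir, arcname=arcname)

    # Re-open the archive to report what was actually stored.
    with tarfile.open(out_path, "r:gz") as tar:
        return sorted(tar.getnames())
```

A call might look like `archive_container_state({"rootfs": "/path/to/container/rootfs", "var-lib-docker": "/var/lib/sysbox/<backing-dir>"}, "state.tar.gz")`, where both paths are placeholders. Reading these directories typically requires root on the node, and files may change while a running container is being archived.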

Sysbox does support docker commit, which captures the entire container filesystem state (including the implicit mounts) into a new container image, so that may help. It's not the same as checkpoint / restore because it does not capture the in-memory state of the container, but it's the closest option, I think.
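As a rough sketch of how a suspend/resume flow built on docker commit might look, the Python helper below assembles the Docker CLI invocations and prints them by default. The function names, container name, and image name are hypothetical, and this assumes the container is managed by Docker directly rather than through the CRI; running processes are not preserved, only filesystem state.

```python
import subprocess


def suspend_commands(container: str, image: str) -> list[list[str]]:
    """Commands to snapshot a container's filesystem and then remove it."""
    return [
        ["docker", "commit", container, image],  # filesystem state -> new image
        ["docker", "push", image],               # store the image in a registry
        ["docker", "stop", container],
        ["docker", "rm", container],
    ]


def resume_command(image: str, name: str) -> list[str]:
    """Command to start a fresh Sysbox container from the committed image."""
    return ["docker", "run", "--runtime=sysbox-runc", "-d", "--name", name, image]


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, or execute it when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if a step fails
```

For example, `run_all(suspend_commands("ide-1", "registry.example.com/ide:snap"))` prints the four suspend steps without touching Docker; passing `dry_run=False` would execute them.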

Hope that helps.

ctalledo avatar Jul 06 '23 14:07 ctalledo