youki
Support checkpoint and restore
Checkpoint and restore are supported by runc. We should also support these operations. There does not seem to be a crate that allows interacting with criu, so we will probably have to write it ourselves. The Go implementation would be a good starting point. If anyone has more information on this topic, it would be appreciated.
Is there anything I can help with?
@Furisto Have you started implementing this yet?
I have started with the implementation. Will let you know if I need support or if we can divide it up.
Hey @duduainankai I could use your help. What I have done so far:
- I have generated the code for the criu protobuf messages
- I can start criu in swrk mode
- I can send a message to criu to dump a simple process
- The message is processed and the process successfully dumped according to the criu logs. The image files are written to the output folder.
- I am waiting to receive a response from criu to tell me everything went well ... and nothing happens, I never receive a response
I am following the steps outlined here and in runc. Maybe you have an idea?
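For readers following along, here is a minimal sketch of the transport side of these steps. It assumes, as the Go implementation appears to do, that criu in swrk mode is handed one end of an `AF_UNIX`/`SOCK_SEQPACKET` socket pair (`criu swrk <fd>`) and that every request is answered with exactly one response message. The helper names (`seqpacket_pair`, `spawn_criu_swrk`, `request`) are hypothetical, the payloads are placeholder bytes rather than protobuf-encoded `criu_req`/`criu_resp` messages, and a thread stands in for criu so that the sketch runs without criu installed. It does not attempt to explain why the response above never arrives.

```rust
use std::ffi::c_int;
use std::fs::File;
use std::io::{self, Read, Write};
use std::os::fd::{AsRawFd, FromRawFd, OwnedFd};
use std::process::{Child, Command};
use std::thread;

// Linux values, declared by hand so the sketch needs no external crates.
const AF_UNIX: c_int = 1;
const SOCK_SEQPACKET: c_int = 5;

unsafe extern "C" {
    fn socketpair(domain: c_int, ty: c_int, protocol: c_int, sv: *mut c_int) -> c_int;
}

/// Create a connected pair of message-oriented Unix sockets.
fn seqpacket_pair() -> io::Result<(OwnedFd, OwnedFd)> {
    let mut fds = [0 as c_int; 2];
    let rc = unsafe { socketpair(AF_UNIX, SOCK_SEQPACKET, 0, fds.as_mut_ptr()) };
    if rc != 0 {
        return Err(io::Error::last_os_error());
    }
    // Safety: on success both descriptors are open and owned by us alone.
    Ok(unsafe { (OwnedFd::from_raw_fd(fds[0]), OwnedFd::from_raw_fd(fds[1])) })
}

/// Hypothetical helper: start `criu swrk <fd>`, letting the child inherit
/// the descriptor. Not called in `main`, so criu is not required to run this.
#[allow(dead_code)]
fn spawn_criu_swrk(criu_end: &OwnedFd) -> io::Result<Child> {
    Command::new("criu")
        .arg("swrk")
        .arg(criu_end.as_raw_fd().to_string())
        .spawn()
}

/// Send one request message and block until one response message arrives.
fn request(sock: &mut File, req: &[u8]) -> io::Result<Vec<u8>> {
    sock.write_all(req)?;
    let mut buf = vec![0u8; 4096];
    let n = sock.read(&mut buf)?; // one read returns one whole message
    buf.truncate(n);
    Ok(buf)
}

fn main() -> io::Result<()> {
    let (ours, theirs) = seqpacket_pair()?;

    // Stand-in for criu: read one request, answer with one response.
    let peer = thread::spawn(move || -> io::Result<()> {
        let mut sock = File::from(theirs);
        let mut buf = [0u8; 4096];
        let n = sock.read(&mut buf)?;
        let mut resp = b"resp:".to_vec();
        resp.extend_from_slice(&buf[..n]);
        sock.write_all(&resp)
    });

    let mut sock = File::from(ours);
    let resp = request(&mut sock, b"dump")?;
    peer.join().expect("peer thread panicked")?;
    println!("{}", String::from_utf8_lossy(&resp));
    Ok(())
}
```

In a real client the bytes passed to `request` would be an encoded request, and the parent would close its copy of the criu end after spawning the child.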
Got it. I will check what you have listed and see what I can do. @Furisto
I started the discussion about a CRIU rust interface here https://github.com/checkpoint-restore/criu/issues/1722
@adrianreber Thanks! Will take a look.
I tried to checkpoint a container and it almost works. The main problem currently is that -d/--detach is missing, which does a setsid(). Are there any plans to implement --detach soon? With --detach, checkpointing should be possible pretty easily.
Checkpointing works now if I do setsid() after starting the container with stdin, stdout, and stderr redirected to /dev/null.
It is also important that youki re-opens /dev/null inside the container, like I did for crun: https://github.com/containers/crun/commit/bbb1fa9f380b0606165222a2414cb7d4d45dd97f
This is also needed: https://github.com/containers/youki/issues/623
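To illustrate the workaround described above in generic terms, here is a minimal sketch that starts a process with stdin, stdout, and stderr pointing at /dev/null and calls setsid() in the child before exec. This is not youki's code: `spawn_detached` and `session_of` are hypothetical helper names, `sleep` is just a stand-in for the container process, and the step of re-opening /dev/null inside the container is not covered.

```rust
use std::ffi::c_int;
use std::io;
use std::os::unix::process::CommandExt;
use std::process::{Child, Command, Stdio};

// libc functions declared by hand so the sketch needs no external crates.
unsafe extern "C" {
    fn setsid() -> c_int;
    fn getsid(pid: c_int) -> c_int;
}

/// Spawn `program` in its own session with all standard streams on /dev/null.
fn spawn_detached(program: &str, args: &[&str]) -> io::Result<Child> {
    let mut cmd = Command::new(program);
    cmd.args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    // Safety: the closure runs in the forked child before exec and only
    // calls setsid(), which is async-signal-safe.
    unsafe {
        cmd.pre_exec(|| {
            if unsafe { setsid() } == -1 {
                return Err(io::Error::last_os_error());
            }
            Ok(())
        });
    }
    cmd.spawn()
}

/// Return the session ID of the process with the given PID.
fn session_of(pid: u32) -> io::Result<i32> {
    let sid = unsafe { getsid(pid as c_int) };
    if sid == -1 {
        Err(io::Error::last_os_error())
    } else {
        Ok(sid)
    }
}

fn main() -> io::Result<()> {
    let mut child = spawn_detached("sleep", &["5"])?;
    let sid = session_of(child.id())?;
    println!("child pid {} is in session {}", child.id(), sid);
    // Clean up the stand-in process.
    child.kill()?;
    child.wait()?;
    Ok(())
}
```

After a successful setsid() the child is the leader of a new session, so its session ID equals its own PID and it no longer shares a controlling terminal with the caller.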
@adrianreber Awesome! Oh, I forgot about it 😭 Can I ask you to create an issue?
Are there any plans to implement --detach soon?
- https://github.com/containers/youki/issues/628
- https://github.com/containers/youki/pull/627
- https://github.com/containers/youki/issues/629
Sorry to ping the issue subscribers here, but is this still relevant? Two of the things mentioned above have been merged/closed, and the first issue has a related PR which has been merged. If I recall correctly, there are also some integration tests for the checkpoint-restore functionality. If this support is done, can we close this issue? If not, what else might be needed for it to work?
I only implemented the checkpoint part. I have not yet found the time to implement the restore part of the code. Once restore is implemented, this can be closed.