runc checkpoint: destroy container error
I created a runc container named mycontainerid by following the "Using runc" section of the README. However, running runc checkpoint mycontainerid fails with an error whether the container is running or paused, as shown below:
# runc list
ID PID STATUS BUNDLE CREATED OWNER
mycontainerid 16670 running /mycontainer 2022-08-28T11:49:44.809303123Z root
# runc checkpoint mycontainerid
ERRO[0000] container still running
ERRO[0000] criu failed: type NOTIFY errno 0
log file: dump.log
# runc list
ID PID STATUS BUNDLE CREATED OWNER
mycontainerid 16670 paused /mycontainer 2022-08-28T11:49:44.809303123Z root
# runc checkpoint mycontainerid
ERRO[0000] container paused
ERRO[0000] criu failed: type NOTIFY errno 0
log file: dump.log
# runc -v
runc version 1.1.0+dev
commit: v1.1.0-272-g4a51b04
spec: 1.0.2-dev
go: go1.19
libseccomp: 2.3.1
So, for some reason, container checkpoint failed. It can happen, and in general this is not a bug per se.
Are you saying that the issue here is leaving the container in a paused state after a failed checkpointing?
@kolyshkin I'm sorry, I may not have described it clearly.
The output above shows two cases in which runc checkpoint mycontainerid fails.
When runc checkpoint is executed on a running or paused container without the --leave-running or --pre-dump flag, the container should be destroyed by default. However, the command fails at the destroy stage: the error message is "container still running" when the container is running, and "container paused" when it is paused.
I looked through the runc code. The container is destroyed when runc checkpoint is executed without the --leave-running or --pre-dump flag, as the code below shows:
https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/checkpoint.go#L70-L73
The code below returns the error ErrRunning when destroy is called on a container in the running state:
https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/libcontainer/state_linux.go#L131-L136
The code below returns the error ErrPaused when destroy is called on a container in the paused state:
https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/libcontainer/state_linux.go#L183-L192
The destroy function only removes containers in the Created or Stopped state; containers in any other state need the killContainer function instead. See the code for the runc delete command:
https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/delete.go#L73-L83
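To make the point concrete, here is a minimal Go sketch of the behavior described above. It is not runc's code: Status, destroy, and remove are hypothetical names chosen for illustration, and killing the container is modeled as simply moving it to the stopped state. Only the two error messages are taken from the output above.

```go
package main

import (
	"errors"
	"fmt"
)

// Status is a simplified stand-in for a container's lifecycle state.
type Status int

const (
	Created Status = iota
	Running
	Paused
	Stopped
)

// String returns a lowercase name for the state.
func (s Status) String() string {
	return [...]string{"created", "running", "paused", "stopped"}[s]
}

// The error texts mirror the messages printed by runc checkpoint above.
var (
	ErrRunning = errors.New("container still running")
	ErrPaused  = errors.New("container paused")
)

// destroy models the state-dependent behavior described in this issue:
// a running or paused container is refused, any other state is removed.
func destroy(s Status) error {
	switch s {
	case Running:
		return ErrRunning
	case Paused:
		return ErrPaused
	default:
		return nil
	}
}

// remove is a hypothetical delete-style path: a container that destroy
// would refuse is killed first (modeled as a transition to Stopped).
func remove(s Status) error {
	if s == Running || s == Paused {
		s = Stopped // assumption: stands in for killing the container
	}
	return destroy(s)
}

func main() {
	for _, s := range []Status{Created, Running, Paused, Stopped} {
		fmt.Printf("%-8s destroy: %v, remove: %v\n", s, destroy(s), remove(s))
	}
}
```

In this model, calling destroy directly after a checkpoint fails for exactly the two states shown in my output, while a kill-then-destroy path succeeds for every state.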
If this is still unclear, try executing the runc checkpoint xxx command in each container state (Created, Running, Paused, Stopped).