
runc checkpoint: destroy container error

Open gosoon opened this issue 3 years ago • 2 comments

I created a runc container, mycontainerid, following the "Using runc" section of the README. Running runc checkpoint mycontainerid then fails with an error, whether the container is running or paused, as shown below:

# runc list
ID              PID         STATUS      BUNDLE         CREATED                          OWNER
mycontainerid   16670       running     /mycontainer   2022-08-28T11:49:44.809303123Z   root


# runc checkpoint mycontainerid
ERRO[0000] container still running
ERRO[0000] criu failed: type NOTIFY errno 0
log file: dump.log


# runc list
ID              PID         STATUS      BUNDLE         CREATED                          OWNER
mycontainerid   16670       paused      /mycontainer   2022-08-28T11:49:44.809303123Z   root

# runc checkpoint mycontainerid
ERRO[0000] container paused
ERRO[0000] criu failed: type NOTIFY errno 0
log file: dump.log


# runc -v
runc version 1.1.0+dev
commit: v1.1.0-272-g4a51b04
spec: 1.0.2-dev
go: go1.19
libseccomp: 2.3.1

gosoon, Aug 28 '22 12:08

So, for some reason, container checkpoint failed. It can happen, and in general this is not a bug per se.

Are you saying that the issue here is leaving the container in a paused state after a failed checkpointing?

kolyshkin, Aug 29 '22 18:08

@kolyshkin I'm sorry, I may not have described it clearly.

The above are two cases in which runc checkpoint mycontainerid fails.

Whether the container is running or paused, runc checkpoint should destroy the container by default when neither --leave-running nor --pre-dump is given. In practice it fails at the destroy step: the error message is "container still running" when the container is running, and "container paused" when it is paused.

I looked through the runc code. The container is destroyed when runc checkpoint is run without the --leave-running or --pre-dump flag, as this code shows:

https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/checkpoint.go#L70-L73

Destroying a container in the running state returns ErrRunning, as this code shows:

https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/libcontainer/state_linux.go#L131-L136

Destroying a container in the paused state returns ErrPaused, as this code shows:

https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/libcontainer/state_linux.go#L183-L192

The destroy function only removes containers in the Created or Stopped state. Containers in any other state need the killContainer function; see the code for the runc delete command:

https://github.com/opencontainers/runc/blob/4a51b047036cf16d4f124548c2a7ff24b5640bad/delete.go#L73-L83

If this is still unclear, you can try running the runc checkpoint xxx command with the container in each state (Created, Running, Paused, Stopped).

gosoon, Aug 30 '22 03:08