runtime-spec

Container status wrt reboots and hooks

Open alban opened this issue 7 years ago • 7 comments

The spec defines 4 statuses:

  • creating
  • created
  • running
  • stopped

What happens when a container is created and then the machine is rebooted? Should the container be deleted, or should it be in the "stopped" state?

The spec also defines hooks and each hook receives the state of the container from stdin. What should be the status of the container in each hook?

  • prestart hook: I guess the state should be "created"
  • poststart hook: I guess the state should be "running"
  • poststop: since the hook is called after the container is deleted, the state is not "stopped" anymore but maybe some phantom "deleted" state?

Could the spec clarify those two points?
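To make the hook question concrete, here is a minimal sketch of a hook that reads the state JSON from stdin and compares the status with the guesses above. The `check_hook_state` helper, the `EXPECTED_STATUS` table and the sample payload are hypothetical names for illustration; only the `id` and `status` fields of the state object are taken from the spec.

```python
import io
import json

# Statuses guessed in this thread for the common case. A runtime may
# define additional statuses, so these are expectations, not rules.
EXPECTED_STATUS = {"prestart": "created", "poststart": "running"}


def check_hook_state(hook_name, stream):
    """Read the state JSON that a runtime writes to a hook's stdin.

    Returns (status, matches). matches is None when no expectation is
    recorded for the hook (for example poststop, which is the open question).
    """
    state = json.load(stream)
    status = state["status"]
    expected = EXPECTED_STATUS.get(hook_name)
    matches = None if expected is None else status == expected
    return status, matches


# Hypothetical payload; a real hook would pass sys.stdin instead.
sample = io.StringIO(json.dumps({"ociVersion": "1.0.0", "id": "demo", "status": "created"}))
result = check_hook_state("prestart", sample)
```

A poststop hook run through the same helper gets `None` for the match, which is exactly the gap this issue asks the spec to fill.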

alban avatar Mar 12 '18 16:03 alban

Your question about rebooting is already covered by the lifecycle, specifically step 7. A reboot causes the container process to exit, but the delete operation was not executed, so the container is naturally in the "stopped" state.

  • prestart hook: I guess the state should be "created"
  • poststart hook: I guess the state should be "running"

These are correct in the most obvious cases, but the spec allows runtimes to add additional states which could also be returned.

  • poststop: since the hook is called after the container is deleted, the state is not "stopped" anymore but maybe some phantom "deleted" state?

I'm not sure about this one, but from memory this is something that some folks from @opencontainers/runtime-tools-maintainers have been working on ironing out. As far as I can tell, runc changes the state to stopped after running the post-stop hooks (which seems wrong to me). Maybe there should be a deleted state.

cyphar avatar Mar 12 '18 16:03 cyphar

How would there be a deleted state...when...it's...deleted?

crosbymichael avatar Mar 12 '18 16:03 crosbymichael

On a reboot, it would be gone because the state is kept in /run and /run is in memory.
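A rough sketch of why the container disappears, assuming a runc-like layout where each container's state lives in `<root>/<id>/state.json` under an in-memory root directory. The layout, the `container_known` helper and the use of a temporary directory to stand in for /run are all assumptions for illustration, not the runtime's actual code.

```python
import json
import os
import shutil
import tempfile


def container_known(root, container_id):
    """Return True if a state file exists for the container under root."""
    return os.path.isfile(os.path.join(root, container_id, "state.json"))


# A temporary directory stands in for the in-memory /run state root.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "demo"))
with open(os.path.join(root, "demo", "state.json"), "w") as f:
    json.dump({"id": "demo", "status": "created"}, f)
before_reboot = container_known(root, "demo")

# Simulate a reboot: everything under an in-memory filesystem is lost.
shutil.rmtree(root)
os.makedirs(root)
after_reboot = container_known(root, "demo")

shutil.rmtree(root)  # clean up the stand-in directory
```

With the state file gone, the runtime has nothing to report "stopped" for, so the container looks as if it never existed.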

crosbymichael avatar Mar 12 '18 16:03 crosbymichael

On Mon, Mar 12, 2018 at 09:25:59AM -0700, Alban Crequy wrote:

  • poststop: since the hook is called after the container is deleted, the state is not "stopped" anymore but maybe some phantom "deleted" state?

I'm fine with ‘stopped’ here, because the spec definition still applies [1]. I'm also fine adding a new ‘deleted’ state to cover this situation.

Either way, the only consumers who should see this state are the post-stop hooks. After the core of ‘delete’ completes (before the post-stop hooks start firing), external callers of ‘state’ may get “container not found” errors for that container ID. It may be worth tightening up the “does not exist” language in [2] to clarify that point.
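As a toy model of that ordering (not how any real runtime is implemented), the sketch below forgets the container first and then fires the post-stop hooks with a ‘stopped’ snapshot. `ToyRuntime` and `ContainerNotFound` are hypothetical names, and passing ‘stopped’ to the hooks is just one of the two options discussed here.

```python
class ContainerNotFound(Exception):
    """Raised when 'state' is called for an unknown container ID."""


class ToyRuntime:
    """In-memory model of create/start/kill/delete with post-stop hooks."""

    def __init__(self):
        self._containers = {}

    def _get(self, cid):
        try:
            return self._containers[cid]
        except KeyError:
            raise ContainerNotFound(cid) from None

    def create(self, cid, poststop=()):
        self._containers[cid] = {"status": "created", "poststop": list(poststop)}

    def start(self, cid):
        self._get(cid)["status"] = "running"

    def kill(self, cid):
        # Model the container process exiting after a signal.
        self._get(cid)["status"] = "stopped"

    def state(self, cid):
        return {"id": cid, "status": self._get(cid)["status"]}

    def delete(self, cid):
        container = self._get(cid)
        if container["status"] != "stopped":
            raise RuntimeError("cannot delete a container that is not stopped")
        # Core of delete: the runtime forgets the container...
        del self._containers[cid]
        # ...and only then do the post-stop hooks fire, with a snapshot.
        snapshot = {"id": cid, "status": "stopped"}
        for hook in container["poststop"]:
            hook(snapshot)


runtime = ToyRuntime()
seen = []


def record(state):
    """Post-stop hook: record its input and what an external caller sees."""
    try:
        runtime.state(state["id"])
        external = "found"
    except ContainerNotFound:
        external = "container not found"
    seen.append((state["status"], external))


runtime.create("demo", poststop=[record])
runtime.start("demo")
runtime.kill("demo")
runtime.delete("demo")
```

In this model the hook is the only consumer that ever sees the final ‘stopped’ snapshot; anyone else asking for the state at that point gets the not-found error.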

wking avatar Mar 12 '18 23:03 wking

On a reboot, it would be gone because the state is kept in /run and /run is in memory.

This makes sense (for runc), but so does @cyphar's argument for stopped. I think we should add spec wording to explicitly allow runtimes to forget about all of their containers on reboot, so folks relying on post-stop hooks for cleanup know that they might have to handle post-reboot cleanup themselves.

wking avatar Mar 13 '18 18:03 wking

I'm just a newcomer, but basically I agree with @alban. Currently it's a little confusing.

At the moment the spec defines a Delete action and a stopped status. The stopped status is used in a mixed way, especially by runc, which has a single Destroy action that both kills the process and deletes the container.

So if we add a deleted status distinct from stopped, we would also want to add a Stop action, so that Stop results in stopped and Delete results in deleted. We would also need to change runc to handle the Stop and Delete actions separately.

dongsupark avatar Apr 04 '18 13:04 dongsupark

On Wed, Apr 04, 2018 at 01:49:10PM +0000, Dongsu Park wrote:

So if we add a deleted status distinct from stopped, we would also want to add a Stop action…

We don't need a Stop action. We already have kill [1], and if you use it to send TERM or KILL (which API implementations MUST support [2]), you can force the container process to exit (which is the ‘stopped’ state [3]).
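The same idea can be shown with an ordinary POSIX process standing in for the container process. This is only an analogy under that assumption: `kill_and_wait` is a hypothetical helper, and a real runtime would deliver the signal through its own kill operation rather than `subprocess`.

```python
import signal
import subprocess
import sys


def kill_and_wait(proc, sig=signal.SIGTERM, timeout=10):
    """Send sig to the process and report a container-style status."""
    proc.send_signal(sig)
    proc.wait(timeout=timeout)
    # Once the process has exited, the container would count as stopped.
    return "stopped" if proc.returncode is not None else "running"


# A sleeping child process plays the role of the container process.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
status = kill_and_wait(child)
```

No separate Stop step is involved: the signal makes the process exit, and the exit alone is what puts it in the stopped state.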

wking avatar Apr 04 '18 17:04 wking