prestart_fail.t leaves a stale youki process running

Open dgibson opened this issue 4 years ago • 4 comments

Although all the runtime-tools test cases run by the integration_test.sh script pass, one of them, prestart_fail.t, leaves a stale youki process running that has to be removed manually with kill -9. I assume this is some sort of failure in the cleanup / exit path.

$ ps afx |grep youki
 199557 pts/6    S+     0:00  |   \_ grep youki
$ ./integration_test.sh prestart_fail
Running prestart_fail/prestart_fail.t
$ ps afx |grep youki
 199599 pts/6    S+     0:00  |   \_ grep youki
 199584 pts/6    S      0:00 /home/dwg/src/youki/youki create --bundle /tmp/ocitest617022754 8d19e193-20f6-49f4-87aa-f6e73afe57b6
$ sudo kill -15 199584
$ ps afx |grep youki
 199638 pts/6    S+     0:00  |   \_ grep youki
 199584 pts/6    S      0:00 /home/dwg/src/youki/youki create --bundle /tmp/ocitest617022754 8d19e193-20f6-49f4-87aa-f6e73afe57b6
$ sudo kill -9 199584
$ ps afx |grep youki
 199653 pts/6    S+     0:00  |   \_ grep youki

dgibson avatar Nov 24 '21 03:11 dgibson

The logfile contains the following:

$ cat ./integration_test/src/github.com/opencontainers/runtime-tools/log/prestart_fail/prestart_fail.t.log 
TAP version 13
failed to start the container
[DEBUG crates/libcontainer/src/hooks.rs:38] 2021-11-24T14:37:06.147184946+11:00 run_hooks arg0: "false", args: []
[DEBUG crates/libcontainer/src/hooks.rs:49] 2021-11-24T14:37:06.147249845+11:00 run_hooks envs: {}
Error: failed to start container 8d19e193-20f6-49f4-87aa-f6e73afe57b6

Caused by:
    0: failed to run pre start hooks
    1: Failed to execute hook command. Non-zero return code. 1
  ---
  {
    "error": "if any prestart hook fails, the runtime MUST generate an error, stop the container, and continue the lifecycle at step 9\nRefer to: https://github.com/opencontainers/runtime-spec/blob/v1.0.2-dev/runtime.md#lifecycle"
  }
  ...
1..0

dgibson avatar Nov 24 '21 03:11 dgibson

@dgibson Thanks for your report. Could you post the output of ./youki info?

utam0k avatar Nov 24 '21 08:11 utam0k

Sure

$ ./youki info
Version           0.0.1
Kernel-Release    5.14.18-300.fc35.x86_64
Kernel-Version    #1 SMP Fri Nov 12 16:43:17 UTC 2021
Architecture      x86_64
Operating System  Fedora Linux 35 (Thirty Five)
Cores             8
Total Memory      31876
Cgroup setup      hybrid
Cgroup mounts
  blkio           /sys/fs/cgroup/blkio
  cpu             /sys/fs/cgroup/cpu,cpuacct
  cpuacct         /sys/fs/cgroup/cpu,cpuacct
  cpuset          /sys/fs/cgroup/cpuset
  devices         /sys/fs/cgroup/devices
  freezer         /sys/fs/cgroup/freezer
  hugetlb         /sys/fs/cgroup/hugetlb
  memory          /sys/fs/cgroup/memory
  net_cls         /sys/fs/cgroup/net_cls,net_prio
  net_prio        /sys/fs/cgroup/net_cls,net_prio
  perf_event      /sys/fs/cgroup/perf_event
  pids            /sys/fs/cgroup/pids
  unified         /sys/fs/cgroup/unified
CGroup v2 controllers
  cpu             detached
  cpuset          detached
  hugetlb         detached
  io              detached
  memory          detached
  pids            detached
  device          attached
Namespaces        enabled
  mount           enabled
  uts             enabled
  ipc             enabled
  user            enabled
  pid             enabled
  network         enabled
  cgroup          enabled

dgibson avatar Nov 24 '21 09:11 dgibson

Let me help and take a look

yihuaf avatar Nov 28 '21 04:11 yihuaf

According to the runtime spec, when a prestart hook fails, the runtime should stop the container and kill the container process. The container should end up in the stopped state.

Reference:

The prestart hooks MUST be invoked by the runtime. If any prestart hook fails, the runtime MUST generate an error, stop the container, and continue the lifecycle at step 12.

yihuaf avatar Mar 29 '23 21:03 yihuaf
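
For illustration, here is a minimal sketch of the behaviour the spec text above asks for: if a prestart hook fails, report the error, kill and reap the waiting container process, and record the container as stopped. This is not youki's implementation (youki is written in Rust); the `Container` class, `HookError`, and the use of a sleeping Python child as a stand-in for the container process are all assumptions made for the example.

```python
import subprocess
import sys


class HookError(RuntimeError):
    """A prestart hook exited with a non-zero status."""


class Container:
    """Hypothetical stand-in for a runtime's container record."""

    def __init__(self, init_proc):
        # init_proc: subprocess.Popen standing in for the waiting container process
        self.init_proc = init_proc
        self.state = "created"

    def start(self, prestart_hooks):
        try:
            for argv in prestart_hooks:
                status = subprocess.run(argv).returncode
                if status != 0:
                    raise HookError(
                        f"prestart hook {argv!r} exited with status {status}"
                    )
        except HookError:
            # Stop the container: Popen.kill() sends SIGKILL on POSIX, and
            # wait() reaps the child so no stale process is left behind.
            self.init_proc.kill()
            self.init_proc.wait()
            self.state = "stopped"
            raise  # the error must still be reported to the caller
        self.state = "running"


if __name__ == "__main__":
    # Assumed stand-ins: a long sleep for the container process and a hook
    # that always fails, similar to the "false" hook used by prestart_fail.t.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    failing_hook = [sys.executable, "-c", "raise SystemExit(1)"]

    container = Container(subprocess.Popen(sleeper))
    try:
        container.start([failing_hook])
    except HookError as err:
        print(f"error: {err}")
    print(container.state, container.init_proc.returncode)
```

The point of the sketch is the ordering in the error path: the process is killed and reaped before the error propagates, so a failed start cannot leave the process behind the way the `youki create` process is left behind in the report above.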