
Podman machine does not stop correctly while running a container

Open cbr7 opened this issue 10 months ago • 5 comments

Issue Description

On version 5.0.2 on macOS it seems that it's not possible to correctly stop the podman machine if it has at least an active container running.

Steps to reproduce the issue

  1. Have podman 5.0.2 installed.
  2. Create a podman machine.
  3. Pull an image and run it as a container.
  4. After the container starts up, try to stop the podman machine.
  5. Notice that an "Error: failed waiting for vm to stop" error is thrown.
  6. At this point the podman machine still shows as running in `podman machine list`, but running `podman images` throws the following error: "Cannot connect to Podman. Please verify your connection to the Linux system using podman system connection list, or try podman machine init and podman machine start to manage a new Linux VM Error: unable to connect to Podman socket: failed to connect: ssh: handshake failed: read tcp 127.0.0.1:58659->127.0.0.1:53782: read: connection reset by peer"
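The steps above can be condensed into a shell sketch. The machine name `repro-vm`, the container name `sleeper`, and the `sleep` command are illustrative assumptions; any image with a long-running process should trigger the same behavior.

```shell
# Reproduction sketch for the steps above. The machine name "repro-vm",
# the container name "sleeper", and the sleep command are illustrative;
# any image with a long-running process should do.
reproduce_machine_stop_bug() {
    podman machine init repro-vm
    podman machine start repro-vm
    podman pull ghcr.io/linuxcontainers/alpine:latest
    podman run -d --name sleeper ghcr.io/linuxcontainers/alpine:latest sleep 600
    podman machine stop repro-vm   # step 5: "Error: failed waiting for vm to stop"
    podman machine list            # step 6: machine still reported as running
}

# Guard so the sketch is safe to source on a host without podman.
if command -v podman >/dev/null 2>&1; then
    echo "podman found: call reproduce_machine_stop_bug to reproduce"
else
    echo "podman not installed; skipping"
fi
```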

Describe the results you received

Error thrown when stopping podman machine

Describe the results you expected

Podman machine successfully stops

podman info output

Error: failed waiting for vm to stop

Error: failed waiting for vm to stop (exit code 125)

============================================

vladimirlazar@Vladimirs-MacBook-Pro-2 ~ % podman images
Cannot connect to Podman. Please verify your connection to the Linux system using `podman system connection list`, or try `podman machine init` and `podman machine start` to manage a new Linux VM
Error: unable to connect to Podman socket: failed to connect: ssh: handshake failed: read tcp 127.0.0.1:56370->127.0.0.1:53782: read: connection reset by peer

Podman in a container

No

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

vladimirlazar@Vladimirs-MacBook-Pro-2 ~ % podman version
Client:        Podman Engine
Version:       5.0.2
API Version:   5.0.2
Go Version:    go1.22.2
Git Commit:    3304dd95b8978a8346b96b7d43134990609b3b29
Built:         Wed Apr 17 21:13:18 2024
OS/Arch:       darwin/arm64

Server:        Podman Engine
Version:       5.0.2
API Version:   5.0.2
Go Version:    go1.21.9
Built:         Wed Apr 17 02:00:00 2024
OS/Arch:       linux/arm64

vladimirlazar@Vladimirs-MacBook-Pro-2 ~ % podman info
host:
  arch: arm64
  buildahVersion: 1.35.3
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc39.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 97.55
    systemPercent: 1.36
    userPercent: 1.09
  cpus: 6
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: coreos
    version: "39"
  eventLogger: journald
  freeLocks: 2048
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.8.4-200.fc39.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 12158222336
  memTotal: 12620021760
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc39.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-1.fc39.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.4-1.fc39.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.4
      commit: a220ca661ce078f2c37b38c92e66cf66c012d9c1
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240405.g954589b-1.fc39.aarch64
    version: |
      pasta 0^20240405.g954589b-1.fc39.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 1m 32.00s
  variant: v8
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 99252940800
  graphRootUsed: 3804274688
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.2
  Built: 1713312000
  BuiltTime: Wed Apr 17 02:00:00 2024
  GitCommit: ""
  GoVersion: go1.21.9
  Os: linux
  OsArch: linux/arm64
  Version: 5.0.2

Additional information

This seems to happen consistently on macOS, but I was not able to reproduce it on Windows 11.

cbr7 avatar Apr 26 '24 10:04 cbr7

@cbr7 could you add the image you're using / pulling /running

benoitf avatar Apr 26 '24 10:04 benoitf

@benoitf I was able to reproduce the issue with the image ghcr.io/linuxcontainers/alpine:latest.

cbr7 avatar Apr 26 '24 10:04 cbr7

$ podman machine start
$ podman run --rm -it fedora

In another terminal:

podman machine stop

then the stop is delayed by 1m30s (screenshot attached)

benoitf avatar Apr 26 '24 10:04 benoitf

From some internal discussion:

  1. podman machine stop should wait longer (at least 90 seconds), as shutdown can be delayed for many reasons.
  2. Investigate a better way to stop containers when they don't react to SIGTERM (the default podman timeout is 10s), so we likely should not rely on systemd to stop them and wait 90s.
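For context on point 2: podman's default stop sequence is SIGTERM, a 10-second grace period, then SIGKILL, and that grace period is tunable with standard flags (`--stop-timeout` on `podman run`, `-t`/`--time` on `podman stop`). A minimal sketch, with illustrative container and image names:

```shell
# Sketch of tuning podman's SIGTERM grace period. Default behavior:
# SIGTERM, wait 10 seconds, then SIGKILL. Names below are illustrative.
demo_stop_timeout() {
    if ! command -v podman >/dev/null 2>&1; then
        echo "podman not installed; skipping demo"
        return 0
    fi
    # Set the per-container grace period at creation time (2s instead of 10s):
    podman run -d --rm --stop-timeout 2 --name short-grace \
        ghcr.io/linuxcontainers/alpine:latest sleep 300
    # Or override it at stop time; -t 0 sends SIGKILL immediately:
    podman stop -t 0 short-grace
}
demo_stop_timeout || echo "demo did not complete (no running podman machine?)"
```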

Luap99 avatar Apr 26 '24 14:04 Luap99

For podman machine possibly investigate reducing the 90s systemd timeout as well? When I want the VM down, I want it down quickly, and it's unlikely that containers in a machine VM are production-critical - early SIGKILL shouldn't hurt that much.
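The point above also suggests a practical workaround until a fix lands: kill all containers yourself with a zero grace period before stopping the machine, so the VM shutdown has nothing left to wait on. A sketch (my suggestion, not a command sequence from the thread; assumes a running machine):

```shell
# Workaround sketch: SIGKILL every running container immediately, then
# stop the machine, so systemd's shutdown wait never comes into play.
fast_machine_stop() {
    if ! command -v podman >/dev/null 2>&1; then
        echo "podman not installed; skipping"
        return 0
    fi
    podman stop --all -t 0   # -t 0: skip the SIGTERM grace period entirely
    podman machine stop
}
fast_machine_stop || echo "stop sequence did not complete"
```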

mheon avatar Apr 26 '24 14:04 mheon

A friendly reminder that this issue had no activity for 30 days.

github-actions[bot] avatar May 27 '24 00:05 github-actions[bot]

Any update on the issue?

odockal avatar Jun 25 '24 11:06 odockal

Yes for 2: https://github.com/containers/podman/pull/23064 fixes the long systemd stop timeout when the container does not exit on SIGTERM.

For 1, I can open a PR to increase the timeout. I guess at some point (maybe after 90s) we should terminate the VM forcefully and print a warning. I don't think machine stop should ever return an error just because the shutdown takes too long.

Luap99 avatar Jun 25 '24 11:06 Luap99

Feel free to test if https://github.com/containers/podman/pull/23097 works for you

Luap99 avatar Jun 25 '24 14:06 Luap99

@Luap99 Thanks! @cbr7 Can you take a look, please?

odockal avatar Jun 26 '24 07:06 odockal

@odockal sure

cbr7 avatar Jun 26 '24 07:06 cbr7