Optionally make exec session terminate with parent
Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)
/kind feature
Description
Add a flag to podman exec that makes the exec session terminate with its parent, similar to bubblewrap's bwrap --die-with-parent.
Immutable distributions use toolbox / distrobox to provide a mutable environment. A common use is to run commands directly within the container (toolbox run [COMMAND] / distrobox enter -- [COMMAND]). Since both rely on exec sessions, they share the same limitation: the child process is not terminated when the terminal emulator is closed.
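For reference, the kernel facility that bwrap --die-with-parent builds on is, as far as I know, the Linux parent-death signal (prctl(PR_SET_PDEATHSIG)). Below is a minimal Python sketch of that mechanism for a directly spawned child. The helper names are hypothetical, it is Linux only, and it is not how podman or conmon work today. It also only covers a direct child, so on its own it would not reach a process sitting behind conmon.

```python
import ctypes
import signal
import subprocess

PR_SET_PDEATHSIG = 1  # value from <linux/prctl.h>

# Load libc in the parent so the child only has to make the call after fork().
_libc = ctypes.CDLL(None, use_errno=True)
_libc.prctl.argtypes = [ctypes.c_int] + [ctypes.c_ulong] * 4
_libc.prctl.restype = ctypes.c_int


def _die_with_parent():
    """Runs in the child between fork() and exec()."""
    if _libc.prctl(PR_SET_PDEATHSIG, int(signal.SIGKILL), 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_PDEATHSIG) failed")


def spawn_die_with_parent(argv):
    """Start argv so the kernel sends it SIGKILL once the thread that
    spawned it (here: the main thread of this process) exits."""
    return subprocess.Popen(argv, preexec_fn=_die_with_parent)
```

For example, spawn_die_with_parent(["sleep", "30"]) starts a sleep that the kernel kills as soon as the calling process goes away, even if that process itself is killed with SIGKILL.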
Steps to reproduce the issue:
- Open System Monitor / the Task Manager equivalent in your desktop environment and search for sleep
- Run the following commands in your terminal emulator (any one of them will work):
podman:
podman run --rm -it \
--name debian \
--entrypoint /bin/sh \
docker.io/library/debian:11
# In a new terminal emulator window
podman exec debian sleep 30
toolbox:
toolbox create
toolbox run sleep 30
distrobox:
distrobox create
distrobox enter -- sleep 30
- Then try to close the terminal emulator; it will prompt something like this:

- Insist on closing it, then look at System Monitor
Describe the results you received:
sleep 30 still runs within container.
Describe the results you expected:
Nah, this is expected, hence this feature request.
Additional information you deem important (e.g. issue happens only occasionally):
Output of podman version:
Client: Podman Engine
Version: 4.2.1
API Version: 4.2.1
Go Version: go1.18.5
Built: Thu Sep 8 03:58:19 2022
OS/Arch: linux/amd64
Output of podman info:
host:
  arch: amd64
  buildahVersion: 1.27.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.4-3.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.4, commit: '
  cpuUtilization:
    idlePercent: 60.53
    systemPercent: 23.27
    userPercent: 16.2
  cpus: 4
  distribution:
    distribution: fedora
    variant: silverblue
    version: "36"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.0.5-200.fc36.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 224858112
  memTotal: 16705081344
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.6-2.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.6
      commit: 18cf2efbb8feb2b2f20e316520e0fd0b6c41ef4d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 28844679168
  swapTotal: 34359734272
  uptime: 42h 55m 22.00s (Approximately 1.75 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/user/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 1
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/user/.local/share/containers/storage
  graphRootAllocated: 510389125120
  graphRootUsed: 201430126592
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 104
  runRoot: /run/user/1000/containers
  volumePath: /var/home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 4.2.1
  Built: 1662580699
  BuiltTime: Thu Sep 8 03:58:19 2022
  GitCommit: ""
  GoVersion: go1.18.5
  Os: linux
  OsArch: linux/amd64
  Version: 4.2.1
Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):
podman-4.2.1-2.fc36.x86_64
Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)
- Latest: No
- Troubleshooting: Yes
Additional environment details (AWS, VirtualBox, physical, etc.):
Fedora Silverblue 36
It's more accurate to say that we can't make them not terminate when the container exits. The kernel enforces the rule that any PID namespace will kill every process in the namespace if PID 1 in the namespace dies; Podman will take down PID 1, guaranteeing that the kernel will unwind the rest of the namespace. For containers without a PID namespace, it's a bit trickier, but we do have an accurate list of processes in the container, which we then individually kill as part of stopping the container. In short, I strongly doubt your Podman reproducer actually does what you think it does; the kernel simply won't allow that to happen.
Sorry if I didn't explain it in enough depth. I'm not talking about processes lingering after the container is stopped; that has never been the case, just as you said.
With toolbox / distrobox executing commands inside the container, the container is NOT stopped after the command finishes.
And the parent I'm talking about isn't PID 1 of the container, but the podman exec process in the terminal emulator. A better term is probably needed here, since structurally the podman exec process isn't a direct parent of the container process.
This is a screenshot which represents the issue better:

If I try to close the terminal emulator, it'll prompt the following:

If I press "Close Terminal", sh, toolbox and podman (which runs the exec command) will be terminated because they're child processes of the VTE session.
However, notice that conmon and its child process sleep aren't part of gnome-terminal-server. When sh is terminated, podman (the exec command) is terminated too, but the corresponding conmon process is kept intact. As a result, sleep 30 isn't terminated.
sleep 30 is only used for demonstration. In reality one could run something resource intensive and then close the terminal emulator, not knowing it is lingering in the background.
This is probably only an issue for the pet container use case. toolbox / distrobox tend to start a trap program inside the container to keep it running. Anything interactive is executed by podman exec, hence this issue.
The feature request, to be precise, is to add an optional flag that makes the conmon process terminate when the corresponding podman process dies.
@mheon I think the request is basically to not double fork conmon and not let it create a new process group to keep it attached to the podman parent process.
Yes. This would work as well.
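To illustrate why detaching matters, here is a small Python sketch. It assumes, as the comment above suggests, that conmon ends up outside the terminal's session and process group; the helper names are hypothetical and this is not conmon's actual code. A child started with setsid() lives in its own session, so the hangup that the terminal delivers on close is not addressed to it.

```python
import os
import subprocess
import sys

# Stand-in for the command run inside the exec session.
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]


def spawn(detached):
    """Start a sleeping child; with detached=True it calls setsid() before exec."""
    return subprocess.Popen(SLEEPER, start_new_session=detached)


def shares_session_with_us(proc):
    """True if the child is still in the caller's session (and so tied to its terminal)."""
    return os.getsid(proc.pid) == os.getsid(0)
```

With spawn(False) the child stays in the caller's session, so shares_session_with_us() returns True; with spawn(True) it returns False, which is the situation the sleep process behind conmon appears to be in.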
@Luap99 Don't know if that works. Conmon dying is only going to take out the first PID the exec session started; anything else it did, probably just reparents on top of PID 1 in the container. So we can definitely kill a single-process exec session, but a podman exec -ti $ctr bash like Toolbox does, we only get bash, not anything bash was doing (unless the shell automatically kills its children on exit, not something we can guarantee for every program).
We don't really have a robust way of tracking what processes were spawned from an exec session right now. We'd basically have to walk the process tree in the container, which seems potentially racy. On CGv2, a child cgroup might be a solution? Just need to make sure it doesn't interfere with the container itself being stopped...
I believe it walks the cgroup and kills all of the pids within the cgroup, or at least I remember this is what we wrote many years ago.
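A rough Python sketch of the "walk the cgroup and kill its PIDs" idea, assuming a cgroup v2 directory that exposes a cgroup.procs file. The function names are hypothetical and this is not Podman's implementation. A single pass like this is racy, since processes can fork between the listing and the kill, which is the concern raised above.

```python
import os
import signal


def pids_in_cgroup(cgroup_dir):
    """Return the PIDs listed in <cgroup_dir>/cgroup.procs."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_cgroup(cgroup_dir, sig=signal.SIGKILL):
    """Signal every process currently listed in the cgroup.

    Returns the PIDs that were actually signalled.
    """
    signalled = []
    for pid in pids_in_cgroup(cgroup_dir):
        try:
            os.kill(pid, sig)
            signalled.append(pid)
        except ProcessLookupError:
            pass  # process exited between listing and kill
    return signalled
```

A per-exec-session child cgroup, as suggested above, would give such a walk a well-defined set of processes to target.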
I wonder if Toolbx could detect this scenario and explicitly terminate the process that it had launched inside the container.
A friendly reminder that this issue had no activity for 30 days.
This seems like more of an issue for toolbx rather than podman.
> This seems like more of an issue for toolbx rather than podman.
Umm... it's not really clear to me what Toolbx could do here. Is there a recommended way to get to the process ID of the conmon process?
> @Luap99 Don't know if that works. Conmon dying is only going to take out the first PID the exec session started;
I think it's good enough if conmon died and took out the first PID that the exec session started, because ...
> anything else it did, probably just reparents on top of PID 1 in the container. So we can definitely kill a single-process exec session, but a podman exec -ti $ctr bash like Toolbox does, we only get bash, not anything bash was doing (unless the shell automatically kills its children on exit, not something we can guarantee for every program).
... if this was a shell directly running on the host without involving any containers, the expectation is that closing the terminal emulator takes out the shell and anything that's willing to die with it. If someone started a process in the background (say, sleep +Inf &), then it's OK if it keeps running in the background.
@giuseppe ameliorated one problematic outcome of this - the processes inside the exec sessions blocking shutdown. See https://github.com/containers/podman/pull/17025
However, it's still worth trying to ensure that the processes inside the exec session go away as soon as the terminal emulator is closed, just as happens when one is working directly on the host.
I have to say that I am a bit puzzled that the processes are outliving their controlling terminal. I know there's an inner nested terminal device for the container, but isn't it supposed to go away with the outer terminal?
> This seems like more of an issue for toolbx rather than podman.
> Umm... it's not really clear to me what Toolbx could do here. Is there a recommended way to get to the process ID of the conmon process?
@mheon @rhatdan @Luap99 @giuseppe Could one of you please help answer this question?
We are brainstorming various options at https://github.com/containers/toolbox/pull/1207 but it's not clear if it's possible for the podman exec caller to get the process ID of conmon(8) or the process inside the container.
Also, it's not clear to me why podman exec --interactive --tty should not terminate the foreground container process with it. Especially when podman exec -it is getting terminated by a SIGHUP from its controlling terminal.
I wonder if it will be easier for you to just use the OCI runtime to do the exec.
e.g. if you do crun exec you circumvent podman and conmon. I am fine with adding something like --die-with-parent to crun, similar to what bwrap does.
Can you please play with it and see if "crun exec" does all you need?
Adding it to Podman/conmon will be much more complicated, we will need to change the way conmon works to not perform a double fork.
That said, podman run forwards all signals (well, the ones that can be caught) into the container, so maybe podman exec should do that too.
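As a rough illustration of what signal forwarding means here, the following Python sketch relays signals received by a wrapper process to a child it started. It is not Podman's implementation; forward_signals is a hypothetical helper, and a real podman exec would have to deliver the signal to the process inside the container rather than to a local child.

```python
import signal
import subprocess


def forward_signals(proc, signums):
    """Install handlers that relay each signal in signums to proc.

    Must be called from the main thread. Returns the previous handlers
    so the caller can restore them afterwards.
    """
    def relay(signum, _frame):
        if proc.poll() is None:  # only signal a child that is still running
            proc.send_signal(signum)

    return {s: signal.signal(s, relay) for s in signums}
```

A wrapper would typically call forward_signals(child, [signal.SIGHUP, signal.SIGTERM]) right after starting the child, so that a hangup from the closing terminal reaches the child instead of only killing the wrapper. SIGKILL and SIGSTOP cannot be caught, so they cannot be forwarded this way.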
ref https://github.com/containers/toolbox/issues/1400