dnsmasq process in wrong cgroup

Open AlbanBedel opened this issue 4 years ago • 5 comments

When a pod/container is started directly with the podman command, the dnsmasq process created by dnsname ends up in the calling user's cgroup:

$ sudo podman start hello
$ cat /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup
12:pids:/user.slice/user-1000.slice/user@1000.service
11:memory:/user.slice/user-1000.slice/user@1000.service
...

That's probably not ideal as the dnsmasq process is then bound to the user slice.

But when a systemd service file generated by podman is used, the dnsmasq process ends up in the service's cgroup:

$ cd /run/systemd/system
$ sudo podman generate systemd --files --name hello
$ sudo systemctl start container-hello.service
$ cat /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup
12:pids:/system.slice/container-hello.service
11:memory:/system.slice/container-hello.service

This is problematic as the dnsmasq process and the container have totally different life cycles. I noticed this problem while trying to start containers using transient units. Transient units are normally removed automatically when they are stopped, but if the dnsmasq process is still running because another container uses it, it prevents the transient unit it was started in from being destroyed.
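To compare the two cases quickly, the cgroup check can be wrapped in a small helper. This is only a sketch: `cgroup_units` is a name made up for illustration, and it assumes the usual `ID:CONTROLLERS:PATH` line format of `/proc/<pid>/cgroup`.

```shell
#!/bin/sh
# Hypothetical helper (not part of podman or dnsname).
# Reads /proc/<pid>/cgroup style lines on stdin and prints the unique
# last path component of each line, i.e. the unit holding the process.
cgroup_units() {
    # First expression drops the "ID:CONTROLLERS:" prefix,
    # second drops everything up to the last "/".
    sed -e 's/^[^:]*:[^:]*://' -e 's|.*/||' | sort -u
}
```

It can be fed the same file as in the commands above, for example `cgroup_units < /proc/$(sudo cat /run/containers/cni/dnsname/podman/pidfile)/cgroup`. A process sitting in the root cgroup yields an empty line.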

I can probably work around this problem in some way for my setup, but I think the dnsmasq process, or any other long-running process related to a cni network, should be in a cgroup whose life cycle matches the cni network's life cycle.

AlbanBedel avatar May 19 '20 11:05 AlbanBedel

@mheon WDYT?

baude avatar May 19 '20 13:05 baude

We could try moving it to the container's cgroup, but that's problematic because dnsmasq should outlive any single container as long as another container in the network is still running.

Best way is likely to make a scope exclusively for dnsmasq (podman-dnsmasq-$NETWORK.service maybe?) under Libpod's default cgroup parent. We have code to do this for cgroupfs and systemd in Podman (we use it for making pod cgroups, but it could easily be repurposed for this).
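A rough shell sketch of the idea, not the Podman code mentioned above. It assumes systemd is the cgroup manager; the function names are hypothetical, the `.scope` suffix is a guess at the final naming, and the slice is passed in by the caller rather than looked up from Libpod. It only prints a `systemd-run --scope` command line instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers (assumptions, not dnsname or Podman code).

# Derive a per-network scope name. Characters outside a conservative
# set are replaced with "_" to keep the unit name valid.
dnsmasq_scope_name() {
    printf 'podman-dnsmasq-%s.scope\n' \
        "$(printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '_')"
}

# Print (dry run) a systemd-run invocation that would start the given
# command in its own transient scope under the given slice.
# Usage: dnsmasq_scope_cmd NETWORK SLICE COMMAND [ARGS...]
dnsmasq_scope_cmd() {
    network=$1
    slice=$2
    shift 2
    printf 'systemd-run --scope --unit=%s --slice=%s' \
        "$(dnsmasq_scope_name "$network")" "$slice"
    printf ' %s' "$@"
    printf '\n'
}
```

For example, `dnsmasq_scope_cmd podman machine.slice dnsmasq --conf-file=/path/to/dnsmasq.conf` prints a command that would put dnsmasq in `podman-dnsmasq-podman.scope`; `machine.slice` here is an assumed cgroup parent and the conf-file path is a placeholder. The scope would then live and die with the dnsmasq process rather than with whichever container happened to trigger it.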

mheon avatar May 19 '20 14:05 mheon

Is there any workaround for this? This makes using more than one network in podman impossible.

carbolymer avatar Apr 14 '21 19:04 carbolymer

@mheon and what about #12? Each pod can have a unique combination of networks attached, so to support that we would probably need a dnsmasq process per pod anyway. It's a larger change, but it would solve both bugs at once.

On the other hand, it would make sense to have a generic solution for the case where a cni plugin starts a process that should outlive the pod it was started for.

AlbanBedel avatar Apr 15 '21 07:04 AlbanBedel

@AlbanBedel We're presently discussing an extensive rearchitecture/rewrite of dnsname to resolve that, which should also resolve this. I'm just waiting for the OK to go ahead and get started on it.

mheon avatar Apr 15 '21 13:04 mheon