backend.ai
Generalize docker cgroup driver support
Currently, the agent layers its own cgroup control routines on top of Docker's cgroupfs lifecycle so that it can still retrieve container statistics after the containers terminate.
This works well with the default cgroup driver, but there are reports that it conflicts with the "systemd" cgroup driver. The relevant configuration option name is `native.cgroupdriver`.
There seems to be no big technical difference between systemd-managed cgroups and cgroupfs-based cgroups, but they use different directory names under sysfs. It is also notable that Kubernetes recommends using the "systemd" cgroup driver and keeping the kubelet and Docker configurations the same.
- https://stackoverflow.com/questions/43794169/docker-change-cgroup-driver-to-systemd/65870152
- https://kubernetes.io/docs/setup/production-environment/container-runtimes/
- https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/
Let's support both types of cgroup drivers to prevent unexpected conflicts on various systems: auto-detect the current cgroup driver and use the directory names that match that driver when collecting stats.
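A rough sketch of the idea, not the agent's actual code: it assumes cgroup v1 and the directory layouts described in the Docker runtime metrics document linked below (`docker/<id>` for cgroupfs, `system.slice/docker-<id>.scope` for systemd). The function names `detect_cgroup_driver` and `container_cgroup_dir` are hypothetical, and detection here shells out to `docker info`, whereas the agent would more likely read the same `CgroupDriver` field through the Docker API.

```python
import subprocess
from pathlib import Path

# Assumption: cgroup v1 hierarchy mounted at the usual location.
CGROUP_ROOT = Path("/sys/fs/cgroup")


def detect_cgroup_driver() -> str:
    """Ask the Docker daemon which cgroup driver it is using.

    Returns the string reported by `docker info`, e.g. "cgroupfs" or "systemd".
    """
    result = subprocess.run(
        ["docker", "info", "--format", "{{.CgroupDriver}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def container_cgroup_dir(
    driver: str,
    controller: str,
    container_id: str,
    root: Path = CGROUP_ROOT,
) -> Path:
    """Build the per-container cgroup directory for the given driver.

    `controller` is a cgroup v1 controller name such as "memory" or "cpuacct";
    `container_id` is the full (long) container ID.
    """
    if driver == "cgroupfs":
        return root / controller / "docker" / container_id
    if driver == "systemd":
        return root / controller / "system.slice" / f"docker-{container_id}.scope"
    raise ValueError(f"unsupported cgroup driver: {driver!r}")
```

With something like this, the stat collector would call `detect_cgroup_driver()` once at startup and pass the result to `container_cgroup_dir()` wherever it currently hard-codes the cgroupfs path.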
References on Docker metrics:
- https://docs.docker.com/config/containers/runmetrics/
- https://tech.kakao.com/2020/06/29/cgroup-driver/
This is an extension to #134.
Related to #865.