docker.socket systemd unit should depend on nss-user-lookup.target

Open oe-hbk opened this issue 4 years ago • 0 comments

Description

On a system that uses non-local users and groups, for example through Active Directory integration with SSSD, the docker.socket systemd unit can start too early and fail to look up the SocketGroup= value when it points to a non-local group. As a result, docker.service does not start on reboot because its docker.socket dependency failed to come up.

Sep 7 22:35:40 HOSTNAME systemd: Failed to chown socket at step GROUP: No such process
Sep 7 22:35:40 HOSTNAME systemd: docker.socket control process exited, code=exited status=216
Sep 7 22:35:40 HOSTNAME systemd: Failed to listen on Docker Socket for the API.
Sep 7 22:35:40 HOSTNAME systemd: Dependency failed for Docker Application Container Engine.
Sep 7 22:35:40 HOSTNAME systemd: Job docker.service/start failed with result 'dependency'.
Sep 7 22:35:40 HOSTNAME systemd: Unit docker.socket entered failed state.
Sep 7 22:35:40 HOSTNAME systemd: Reached target Sockets.
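
The "Failed to chown socket at step GROUP" line suggests the group name could not be resolved at the moment the socket was set up. A minimal sketch for checking whether a group currently resolves through NSS, assuming `getent` is available; `group_resolves` is a hypothetical helper name used only for illustration:

```shell
#!/bin/sh
# Sketch: report whether a group name resolves through NSS (files, sss, ...).
# group_resolves is a hypothetical helper, not part of Docker, SSSD, or systemd.
group_resolves() {
    # getent exits non-zero when the name is not found in any configured source.
    getent group "$1" > /dev/null 2>&1
}
```

For example, `group_resolves Name_of_Remote_Group && echo resolvable || echo "not resolvable yet"` run shortly after boot would show whether the remote group is visible yet.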

Steps to reproduce the issue:

  1. Configure SSSD to integrate with Active Directory or LDAP for non-local users and groups.
  2. Run systemctl edit docker.socket and add:
       [Socket]
       SocketGroup=Name_of_Remote_Group
  3. Reboot machine.

Describe the results you received: docker.service often comes up failed because docker.socket failed with the error message "Failed to chown socket at step GROUP: No such process".

Describe the results you expected: docker.socket should come up fine on reboots. At the moment, one needs to manually restart docker.socket and docker.service after a reboot.

Additional information you deem important (e.g. issue happens only occasionally): The issue doesn't always happen because the boot ordering is not fixed: by chance, sssd sometimes starts up before docker.socket.

Output of docker version:

 docker version
Client: Docker Engine - Community
 Version:           19.03.12
 API version:       1.40
 Go version:        go1.13.10
 Git commit:        48a66213fe
 Built:             Mon Jun 22 15:46:54 2020
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          19.03.12
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.13.10
  Git commit:       48a66213fe
  Built:            Mon Jun 22 15:45:28 2020
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.2.13
  GitCommit:        7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc:
  Version:          1.0.0-rc10
  GitCommit:        dc9208a3303feef5b3839f4323d9beb36df0a9dd
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

Output of docker info:

Client:
 Debug Mode: false

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 8
 Server Version: 19.03.12
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: systemd
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
 runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
 init version: fec3683
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-1127.el7.x86_64
 Operating System: Red Hat Enterprise Linux
 OSType: linux
 Architecture: x86_64
 CPUs: 4
 Total Memory: 11.56GiB
 Name: hostname.example.com
 ID: 4JP6:JFPK:6LSX:IKXN:UAZL:EDK5:UJ7L:H4Q6:SMKA:BIUV:M5FA:5QB2
 Docker Root Dir: /data/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.): When the service and socket start up properly, everything works: /var/run/docker.sock is correctly owned by the remote group, and members of that group can run the docker CLI properly.

The solution is to add

[Unit]
After=nss-user-lookup.target
DefaultDependencies=no

to the docker.socket definition. I am temporarily overriding it in /etc/systemd/system/docker.socket.d/override.conf.
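
The override described above could be written non-interactively along these lines. This is a sketch only: `write_docker_socket_dropin` is a hypothetical helper name, the default path and the [Unit] contents are taken from this report, and the optional directory argument is an assumption added so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: write a drop-in that orders docker.socket after NSS user/group lookup.
# write_docker_socket_dropin is a hypothetical helper, not part of Docker or systemd.
write_docker_socket_dropin() {
    # $1: drop-in directory; defaults to the path mentioned in this report.
    dropin_dir="${1:-/etc/systemd/system/docker.socket.d}"
    mkdir -p "$dropin_dir" || return 1
    # Quoted heredoc delimiter: the contents are written literally, without expansion.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
After=nss-user-lookup.target
DefaultDependencies=no
EOF
}
```

After running `write_docker_socket_dropin` as root, `systemctl daemon-reload` makes systemd pick up the change, and `systemctl show -p After docker.socket` should then list nss-user-lookup.target.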

oe-hbk, Sep 08 '20 14:09