docker.socket systemd unit should depend on nss-user-lookup.target
Description
On a system that uses non-local users and groups, for example through Active Directory integration with SSSD, the docker.socket systemd unit can start too early, before the SocketGroup= value (which points to a non-local group) can be looked up. As a result, docker.service fails to start on reboot because its docker.socket dependency did not come up.
Sep 7 22:35:40 HOSTNAME systemd: Failed to chown socket at step GROUP: No such process
Sep 7 22:35:40 HOSTNAME systemd: docker.socket control process exited, code=exited status=216
Sep 7 22:35:40 HOSTNAME systemd: Failed to listen on Docker Socket for the API.
Sep 7 22:35:40 HOSTNAME systemd: Dependency failed for Docker Application Container Engine.
Sep 7 22:35:40 HOSTNAME systemd: Job docker.service/start failed with result 'dependency'.
Sep 7 22:35:40 HOSTNAME systemd: Unit docker.socket entered failed state.
Sep 7 22:35:40 HOSTNAME systemd: Reached target Sockets.
Steps to reproduce the issue:
- Configure SSSD to integrate with Active Directory or LDAP for non-local users and groups.
- Run systemctl edit docker.socket and add SocketGroup=Name_of_Remote_Group under the [Socket] section.
- Reboot machine.
Describe the results you received: After a reboot, docker.service often comes up failed because docker.socket failed with the error message "Failed to chown socket at step GROUP: No such process".
Describe the results you expected: docker.socket should come up reliably on reboot. At the moment, docker.socket and docker.service have to be restarted manually after a reboot.
Additional information you deem important (e.g. issue happens only occasionally): The issue does not always happen. Depending on how the boot happens to be ordered, sssd can start before docker.socket, in which case the group lookup succeeds.
Output of docker version:
docker version
Client: Docker Engine - Community
Version: 19.03.12
API version: 1.40
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:46:54 2020
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 19.03.12
API version: 1.40 (minimum version 1.12)
Go version: go1.13.10
Git commit: 48a66213fe
Built: Mon Jun 22 15:45:28 2020
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.2.13
GitCommit: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc:
Version: 1.0.0-rc10
GitCommit: dc9208a3303feef5b3839f4323d9beb36df0a9dd
docker-init:
Version: 0.18.0
GitCommit: fec3683
Output of docker info:
Client:
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 8
Server Version: 19.03.12
Storage Driver: overlay2
Backing Filesystem: xfs
Supports d_type: true
Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: systemd
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 7ad184331fa3e55e52b890ea95e65ba581ae3429
runc version: dc9208a3303feef5b3839f4323d9beb36df0a9dd
init version: fec3683
Security Options:
seccomp
Profile: default
Kernel Version: 3.10.0-1127.el7.x86_64
Operating System: Red Hat Enterprise Linux
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 11.56GiB
Name: hostname.example.com
ID: 4JP6:JFPK:6LSX:IKXN:UAZL:EDK5:UJ7L:H4Q6:SMKA:BIUV:M5FA:5QB2
Docker Root Dir: /data/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional environment details (AWS, VirtualBox, physical, etc.): When the service and socket start up properly, everything works: /var/run/docker.sock is correctly owned by the remote group, and members of that group can run the docker CLI.
The solution is to add After=nss-user-lookup.target and DefaultDependencies=no to the [Unit] section of the docker.socket definition. For now I am applying this through a drop-in override at /etc/systemd/system/docker.socket.d/override.conf.
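For reference, the complete override file might look like the sketch below. It assumes the SocketGroup= setting from the reproduction steps lives in the same drop-in (which is where systemctl edit docker.socket puts it), and Name_of_Remote_Group is a placeholder for the real group name. The comments are my understanding of why each line is needed, not something confirmed against the packaged unit.

```ini
# /etc/systemd/system/docker.socket.d/override.conf
# Sketch of a drop-in override; adjust the group name for your environment.

[Unit]
# Order the socket after the point where the full user/group database
# (here provided by SSSD) is expected to be available.
After=nss-user-lookup.target
# Drop the implicit socket dependencies (such as Before=sockets.target),
# which presumably would otherwise conflict with waiting for SSSD.
DefaultDependencies=no

[Socket]
# Placeholder: the non-local group that should own /var/run/docker.sock.
SocketGroup=Name_of_Remote_Group
```

After editing the file by hand, systemctl daemon-reload is needed for systemd to pick up the change; systemctl edit does this automatically.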