DinD non-TLS / non-SSL mode of operation should not be crippled or removed

Open fredcooke opened this issue 3 years ago • 11 comments

  • [ ] This is a bug report
  • [x] This is a feature request
  • [x] I searched existing issues before opening this one

Expected behavior

Docker DinD starts quickly regardless of which configuration option is chosen, and non-SSL operation remains available long term for valid use cases.

Actual behavior

Docker DinD takes an additional 15 seconds to start when configured for non-SSL use, unless additional arguments are passed to bypass the sleep 15, and it warns the user that non-SSL behaviour is due to be removed.

Steps to reproduce the behavior

export DOCKER_TLS_CERTDIR=""
docker run --privileged --rm -e DOCKER_TLS_CERTDIR -u 0 -it docker:20.10.9-dind-alpine3.14 dockerd-entrypoint.sh

Output of above

INFO[2021-10-10T10:59:23.350415800Z] Starting up                                  
WARN[2021-10-10T10:59:23.352564700Z] could not change group /var/run/docker.sock to docker: group docker not found 
WARN[2021-10-10T10:59:23.352805400Z] Binding to IP address without --tlsverify is insecure and gives root access on this machine to everyone who has access to your network.  host="tcp://0.0.0.0:2375"
WARN[2021-10-10T10:59:23.352874600Z] Binding to an IP address, even on localhost, can also give access to scripts run in a browser. Be safe out there!  host="tcp://0.0.0.0:2375"
WARN[2021-10-10T10:59:24.357979000Z] Binding to an IP address without --tlsverify is deprecated. Startup is intentionally being slowed down to show this message  host="tcp://0.0.0.0:2375"
WARN[2021-10-10T10:59:24.358061900Z] Please consider generating tls certificates with client validation to prevent exposing unauthenticated root access to your network  host="tcp://0.0.0.0:2375"
WARN[2021-10-10T10:59:24.358090100Z] You can override this by explicitly specifying '--tls=false' or '--tlsverify=false'  host="tcp://0.0.0.0:2375"
WARN[2021-10-10T10:59:24.358130200Z] Support for listening on TCP without authentication or explicit intent to run without authentication will be removed in the next release  host="tcp://0.0.0.0:2375"
INFO[2021-10-10T10:59:39.336889500Z] libcontainerd: started new containerd process  pid=39

Output of docker version:

See command above, version explicitly specified.

Output of docker info:

Container behaviour should be identical every time in most cases; --privileged could change this, but it's not relevant to this ticket.

Additional environment details (AWS, VirtualBox, physical, etc.)

DinD

Valid Use Case for non-TLS Operation

When running in Kubernetes and using dind within a multi-container pod, the underlying reality is that the containers are on the same host and talk to each other over local loopback, so TLS doesn't really make sense: there is no man in the middle, and anyone who is on the box already owns you and your entire cluster.

I've converted our stack over to SSL due to the threatened removal of this behaviour and the extra hassle of making it not sleep 15 seconds during startup in a critical application. However, the SSL startup is slower and SSL communication over local loopback is slightly slower and more CPU intensive, and if someone is on the box and wants to MITM on the loopback interface then they have access to the certs/keys on disk anyway.

Thanks to @tianon who suggested putting this here.

Ref docker-library/docker#292

fredcooke avatar Oct 12 '21 01:10 fredcooke

Hmm.. so there is an exception in the code for the daemon listening on a loopback interface, but that won't apply in the docker-in-docker case https://github.com/moby/moby/blob/c4040417b6fe21911dc7ab5e57db27519dd44a6a/cmd/dockerd/daemon.go#L681-L687

Perhaps we need some "i-know-what-im-doing" env-var to skip 🤔

@akihirosuda @cyphar @cpuguy83 any thoughts?

thaJeztah avatar Oct 25 '21 16:10 thaJeztah

It is complaining that 0.0.0.0:2375 requires an explicit --tlsverify=false. It should not have a delay if you explicitly set --tlsverify=false or --tls=false.

cpuguy83 avatar Oct 25 '21 17:10 cpuguy83

@cpuguy83 that is the case, however I will point out that:

  • --tls=false / --tlsverify=false alone does not result in non-tls behaviour
  • An empty DOCKER_TLS_CERTDIR does result in non-tls behaviour, but with an obnoxious 15 second startup delay - print the warning and go for it, or deprecate and fail fast

To get GOOD behaviour you MUST use both things. I feel like the cert dir variable should not matter and the argument should be the only thing to do. A dir setting controlling an entire behaviour? Seriously? :-D
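
A minimal sketch of combining both settings, based on the reproduction command above. It assumes the image entrypoint appends extra arguments to its default dockerd invocation, which matches the behaviour described in this thread but is not verified here. The function name start_dind_nontls and the DOCKER_BIN override are hypothetical, added only so the command line can be inspected without starting a container.

```shell
# Sketch only: start a non-TLS dind daemon with both settings applied.
# start_dind_nontls and DOCKER_BIN are hypothetical names, not part of Docker.
start_dind_nontls() {
    image="${1:-docker:20.10.9-dind-alpine3.14}"
    # An empty DOCKER_TLS_CERTDIR tells the image entrypoint not to use TLS.
    # --tls=false states the intent explicitly, which the warning output
    # says is how the startup delay is overridden.
    ${DOCKER_BIN:-docker} run --privileged --rm \
        -e DOCKER_TLS_CERTDIR="" \
        -u 0 -it \
        "$image" --tls=false
}
```

Running start_dind_nontls with no arguments uses the image tag from the report; setting DOCKER_BIN=echo prints the command line instead of executing it.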

fredcooke avatar Oct 25 '21 20:10 fredcooke

@fredcooke There is no such variable in this codebase. Is that from the dind image?

cpuguy83 avatar Oct 25 '21 20:10 cpuguy83

Yep https://github.com/docker-library/docker/blob/9728dce92752348ac2623bcf96436f1a89e15dd3/20.10/docker-entrypoint.sh#L15-L20

cpuguy83 avatar Oct 25 '21 20:10 cpuguy83

Yep -- to be clear, the feature request I was suggesting was being able to listen on both TLS and non-TLS simultaneously, but the way the flags are designed that's a little complicated.

tianon avatar Oct 25 '21 21:10 tianon

This has caused similar difficulties for us in moving from docker 19.x to 20.10.x - TLS is not necessary in our use case (using the dind container as part of a k8s pod for ephemeral build nodes linked to Jenkins). However the delay in bringing up the daemon causes our builds to fail as the docker daemon doesn't respond quickly enough when the pod is initially brought up. We can configure TLS, but as mentioned by @fredcooke this adds unnecessary overhead where it is not needed.
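
Separately from removing the delay, a build script can tolerate a slow daemon by polling until it answers. This is a generic sketch, not something proposed in this thread: wait_for is a hypothetical helper, and the 30 second budget and the tcp://localhost:2375 address in the usage comment are assumptions for a pod whose containers share a network namespace.

```shell
# wait_for MAX_ATTEMPTS COMMAND [ARGS...]
# Runs COMMAND about once per second until it succeeds.
# Returns 0 on success, 1 once MAX_ATTEMPTS attempts have failed.
wait_for() {
    max_attempts="$1"
    shift
    attempts=0
    until "$@" >/dev/null 2>&1; do
        attempts=$((attempts + 1))
        if [ "$attempts" -ge "$max_attempts" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Usage (assumed address; adjust to your pod layout):
#   export DOCKER_HOST=tcp://localhost:2375
#   wait_for 30 docker info || { echo "docker daemon not ready" >&2; exit 1; }
```

This only hides the symptom: the daemon still starts 15 seconds late unless the delay itself is avoided.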

jossansone avatar Dec 09 '22 17:12 jossansone

After four hours of debugging, I finally came to this issue. 🤦🏻‍♂️ Using DinD with GitLab pipelines, but got affected by this nevertheless.

MindTooth avatar Jan 30 '23 16:01 MindTooth

My solution was to run dind with a daemon.json mounted as a volume that contains {"tls": false}.

Here's the full solution:

version: "3"
services:
  service1:
    image: docker:dind
    privileged: true
    networks:
      mynetwork:
        ipv4_address: 172.16.0.2
    volumes:
      - ./daemon.json:/etc/docker/daemon.json
    environment:
      - DOCKER_TLS_CERTDIR=
      - DOCKER_HOST=tcp://172.16.0.2:2375
    # Add additional configurations for service1 if needed

networks:
  mynetwork:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.0.0/24

with the following daemon.json:

{
  "tls": false
}

chiptus avatar Jul 04 '23 06:07 chiptus

You can also override the command used. In our case, the GitLab service looks like this:

services:
  - name: docker:dind
    # Good
    command: [ "dockerd", "-H", "tcp://0.0.0.0:2375", "--tls=false" ]
    # Bad
    # command: [ "--tls=false" ]

Without overriding the whole command being executed (not only the arguments), dind would add the artificial wait time.

maciej-gol avatar Jul 19 '23 12:07 maciej-gol

worked, Thanks

preetsindhal avatar Jul 23 '23 14:07 preetsindhal