Unmountable secrets
Tell us about your request
A configurable mechanism to have Docker unmount secrets when they're no longer needed.
Which service(s) is this request for?
Docker Swarm
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
Currently, secrets are left mounted in the container for the lifetime of the container. This leaves secrets vulnerable if a container is compromised.
Additional context
A possible solution would be to add an unmount section to the long-form secrets definition in a service. An example might be:
services:
  frontend:
    image: example/webapp
    secrets:
      - source: server-certificate
        target: server.cert
        unmount_on:
          condition: service_healthy
          timeout: 1m
One unmount clause might be sufficient, but unmounting on the first matched clause would also make sense.
One issue is if a process reads a secret, but leaves the file open. This would presumably cause the unmount to fail. I'm not sure if this should just be logged as an error, or treated as a fatal event since the container would not be in the expected state with a secret left mounted.
You’re right that currently, Docker Swarm keeps secrets mounted for the container’s lifetime, which can increase risk if the container is compromised. The idea of having a configurable option to unmount secrets based on conditions (such as when a service is healthy or after a timeout) makes a lot of sense for strengthening security.
Since this would likely require changes to both Docker’s secret management logic and the Swarm service definitions, I have a couple of thoughts and questions for moving forward:
Design Discussion: Would you be open to collaborating on a more detailed design proposal?
For example, we could draft how the unmount_on clause would look in the Compose spec, and outline expected behaviours for cases like open file handles.
Prior Art: Are there any similar features in other orchestrators (like Kubernetes) that we could reference for inspiration or implementation ideas?
Implementation Steps: If maintainers think this is feasible, we could break it into smaller steps (like updating the Compose spec and then the Docker engine code) to make it easier for contributors to help out.
If you’re interested, I’d be happy to help start a draft PR for this.
I'd be happy to write up a first pass at my thoughts for a config section with options and behaviors, and to work together on a design. I'm not a Docker expert, but I can already see questions around a mini secrets lifecycle inside the more complex container lifecycle. I can also take a look at other implementations.
You’re right that currently, Docker Swarm keeps secrets mounted for the container’s lifetime, which can increase risk if the container is compromised. The idea of having a configurable option to unmount secrets based on conditions (such as when a service is healthy or after a timeout) makes a lot of sense for strengthening security.
I'm not sure unmounting would help much here. Swarm tasks are re-created when the container exits, so while unmounting would help somewhat if that specific task was compromised, it would not help if the compromised container could persist changes (e.g. by storing the payload in a volume that gets re-attached): crashing the container to force a new instance to be created would then give access to the secret again. Depending on how it's used, either the secret itself would still be there (e.g. in memory), or the code using the secret would still be compromised.
I think the only real solution for such things would be to either have a "one-time" secret that gets rotated on first use and/or "dynamic" secrets, where the secret is not a static value but (e.g.) some fuse mount.
Some of what you list are application-level decisions that trade security against functionality. But "some applications do insecure things" shouldn't be an argument for leaving unencrypted secrets lying around. As for crashing the container, the idea is that automatic unmounting would leave only a small window for what would have to be crash -> restart -> re-compromise service -> obtain secret. I view Docker Secrets as the simplest possible secrets manager; the automatic unmount tries to minimize exposure while maintaining that simplicity.
A fuse mount would be interesting, but would fall outside Docker secrets altogether. One-time-use passwords are also interesting, but I think they would require some form of hook within Docker to repopulate a new password on service start/restart. This could also cause contention when one secret is shared between replicas and services while a system is starting up or recovering from an event.
One other solution I thought of that would be simple and (maybe) a small change would be to optionally use a named pipe for the secret. With this, the Docker controller would write the secret to the pipe, and it would then be inherently read-once from within the container. This provides the absolute minimum amount of time a secret is accessible, and an application finding an empty secret file is a big red flag. It would even catch if secrets are being read from a compromised host. The issue would be that pipes are blocking, so writing becomes more complicated.
A quick test shows that pipes are at least viable:
$ mkfifo pipe
$ docker run --volume ./pipe:/tmp/pipe alpine:latest cat /tmp/pipe &
$ echo hello world > pipe
$ hello world
[1]+ Done docker run --volume ./pipe:/tmp/pipe alpine:latest cat /tmp/pipe
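The read-once behaviour can also be seen without Docker. The following is only a sketch, assuming a Linux host with `mkfifo` and `timeout` (GNU coreutils or BusyBox); the secret value and the variable names are made up for illustration. It also shows the blocking problem mentioned above: the writer has to run in the background because opening a FIFO for writing blocks until a reader appears.

```shell
#!/bin/sh
# Sketch only: demonstrates read-once FIFO semantics without involving Docker.
# The secret value and file name are placeholders.
set -eu

workdir=$(mktemp -d)
fifo="$workdir/secret"
mkfifo -m 600 "$fifo"

# Writer side (standing in for whatever would deliver the secret).
# Opening a FIFO for writing blocks until a reader opens it, hence the "&".
printf 'example-secret-value\n' > "$fifo" &
writer_pid=$!

# First reader: receives the secret and drains the pipe.
first_read=$(cat "$fifo")
wait "$writer_pid"

# Second reader: no writer is left, so the open would block indefinitely.
# "timeout" bounds the wait; the read yields nothing.
second_read=$(timeout 1 cat "$fifo" || true)

echo "first read:  '$first_read'"
echo "second read: '$second_read'"

rm -rf "$workdir"
```

The second read comes back empty after the one-second timeout, which is the "big red flag" case: a process that expected a secret and found none.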
Here are my current thoughts on the unmount section:
services:
  service:
    secrets:
      - ...
        unmount_on:
          condition: (service_started|service_healthy)
          delay: duration
          on_failure: (ignore|terminate)
The unmount_on handler causes Docker to unmount a secrets file from a container once the container reaches the given condition, after an optional delay. Common configurations would be condition: service_started with a fixed delay, or condition: service_healthy (as defined by a healthcheck section) with an optional additional delay. If the unmount fails, on_failure determines what happens to the container: ignore lets the container continue running with the secrets file still available, while terminate terminates the container process, which is then subject to any restart policy.
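To make the intended flow concrete, here is a rough shell sketch of the delay and on_failure handling. Everything in it is hypothetical: `apply_unmount_on` is not a Docker function, the condition is assumed to have been met already, and `true`/`false` stand in for an unmount that succeeds or fails.

```shell
#!/bin/sh
# Hypothetical sketch of the unmount_on flow; none of these names exist in Docker.
# Usage: apply_unmount_on DELAY_SECONDS ON_FAILURE UNMOUNT_COMMAND [ARGS...]
# Returns 0 unless the container should be terminated (1) or the policy is invalid (2).
apply_unmount_on() {
    uo_delay=$1
    uo_on_failure=$2
    shift 2

    # The condition (service_started or service_healthy) is assumed to have
    # been met already; only the optional delay remains.
    sleep "$uo_delay"

    if "$@"; then
        echo "unmounted"
        return 0
    fi

    case "$uo_on_failure" in
        ignore)
            # Container keeps running with the secret still mounted.
            echo "unmount failed: ignored"
            return 0
            ;;
        terminate)
            # Caller would stop the container; the restart policy then applies.
            echo "unmount failed: terminate"
            return 1
            ;;
        *)
            echo "invalid on_failure: $uo_on_failure" >&2
            return 2
            ;;
    esac
}

# Stand-ins for a real unmount: "true" always succeeds, "false" always fails.
apply_unmount_on 0 terminate true
apply_unmount_on 0 ignore false
apply_unmount_on 0 terminate false || echo "container would be stopped here"
```

The non-zero return for terminate is a design choice of this sketch, so that a caller can tell "secret gone" and "secret knowingly left in place" apart from "container must stop".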
I went back and forth between unmount_on/condition and unmount/require_healthy. The condition style allows more flexibility if there are any additional states in the future. I also considered including retries/retry_interval, but ultimately rejected them. There are reported issues with unmounts failing in containerd; one example is Microsoft's security scans. containerd already has a default unmount retry policy of 50 retries with a 50ms sleep time, so I think simply allowing delay is enough if there are non-random issues.
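For reference, a bounded retry of that shape looks roughly like the sketch below. It only mirrors the figures quoted above (50 attempts, 50 ms apart, so about 2.5 seconds in the worst case) and is not containerd's actual code; `retry_unmount` and `flaky_unmount` are made-up names, and the fractional `sleep` assumes GNU coreutils or BusyBox.

```shell
#!/bin/sh
# Sketch of a bounded retry loop; not containerd's implementation.
# Usage: retry_unmount ATTEMPTS SLEEP_SECONDS COMMAND [ARGS...]
retry_unmount() {
    ru_attempts=$1
    ru_sleep=$2
    shift 2

    ru_try=1
    while [ "$ru_try" -le "$ru_attempts" ]; do
        if "$@"; then
            echo "succeeded on attempt $ru_try"
            return 0
        fi
        ru_try=$((ru_try + 1))
        # Fractional sleep works with GNU coreutils and BusyBox, not strict POSIX.
        sleep "$ru_sleep"
    done

    echo "gave up after $ru_attempts attempts"
    return 1
}

# Fake unmount that fails twice (as if the mount were busy) and then succeeds.
flaky_calls=0
flaky_unmount() {
    flaky_calls=$((flaky_calls + 1))
    [ "$flaky_calls" -ge 3 ]
}

retry_unmount 50 0.05 flaky_unmount   # prints "succeeded on attempt 3"
```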
For default values, I would suggest that terminate be the default for on_failure. Unmount failure should be a rare case, and I think leaving the container in an unexpected security-related state is bad. There are no inherent defaults for condition/delay though, since the defaults would depend on the presence of a health check. This also adds a potential configuration error check: setting condition: service_healthy would require a valid healthcheck.
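The configuration check could be as small as the following sketch. It is purely illustrative: `validate_unmount_on` is a hypothetical helper, and the presence of a healthcheck is passed in as a plain yes/no rather than read from a real service definition.

```shell
#!/bin/sh
# Hypothetical validation of one unmount_on block; not part of any Docker tool.
# Usage: validate_unmount_on CONDITION HAS_HEALTHCHECK(yes|no) [ON_FAILURE]
validate_unmount_on() {
    vu_condition=$1
    vu_has_healthcheck=$2
    # Suggested default from the discussion: terminate.
    vu_on_failure=${3:-terminate}

    case "$vu_condition" in
        service_started) ;;
        service_healthy)
            if [ "$vu_has_healthcheck" != "yes" ]; then
                echo "error: condition service_healthy requires a healthcheck"
                return 1
            fi
            ;;
        *)
            echo "error: unknown condition '$vu_condition'"
            return 1
            ;;
    esac

    case "$vu_on_failure" in
        ignore|terminate) ;;
        *)
            echo "error: unknown on_failure '$vu_on_failure'"
            return 1
            ;;
    esac

    echo "ok: condition=$vu_condition on_failure=$vu_on_failure"
}

validate_unmount_on service_started no          # ok, on_failure defaults to terminate
validate_unmount_on service_healthy yes ignore  # ok
validate_unmount_on service_healthy no || true  # error: healthcheck missing
```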
I'm putting this configuration in the association of a secret to a service. I can see utility in having this defined in the secrets top-level section as default behavior. However it would add complexity and it differs from the current setup where uid/gid/permissions are defined only under the service.
Additionally, it might be good to have a note that using unmount_on is different from docker service update --secret-rm. The service update modifies the configuration at the service level and restarts the service without the secret, while unmount_on is a per-container post-initialization step.
Issues:
The main issue is: is this request actually feasible as written? It seems containerd is responsible for assembling and tearing down the filesystems, while Docker configures the filesystem and runs the health checks. Therefore, this would need to be implemented as a request from Docker to containerd via CRI to perform the unmount, which is a much more far-reaching change than I originally envisioned. I'm not seeing any support for runtime volume unmounting (or mounting) in the CRI definition.
Another issue is that even if the file is unmounted from the container, it's still available in a read-only tmpfs mount on the host. It would be preferable to clean that up as well, which would mean remounting read/write, removing the secret file from the tmpfs mount, and remounting back to read-only.
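As a sketch of that host-side sequence, assuming a Linux host and root privileges for the real run: the mount point and file name below are placeholders, since the actual location of a task's secrets tmpfs is not assumed here, and `scrub_secret` is a hypothetical helper. With DRY_RUN=1 (the default) the commands are only printed.

```shell
#!/bin/sh
# Sketch of the remount / remove / remount sequence. Paths are placeholders.
# With DRY_RUN=1 (the default) nothing is executed; commands are only printed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: scrub_secret TMPFS_MOUNTPOINT SECRET_FILE_NAME
scrub_secret() {
    ss_mountpoint=$1
    ss_name=$2

    # 1. Make the tmpfs writable.
    run mount -o remount,rw "$ss_mountpoint" || return 1

    # 2. Remove the secret file, remembering whether that worked.
    ss_status=0
    run rm -f -- "$ss_mountpoint/$ss_name" || ss_status=$?

    # 3. Always try to go back to read-only, even if the removal failed.
    run mount -o remount,ro "$ss_mountpoint" || return 1

    return "$ss_status"
}

scrub_secret /path/to/secrets-tmpfs server.cert
```

Restoring read-only even when the removal fails is a choice made for this sketch, so that a failed cleanup does not also leave the mount writable.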
Should this change affect docker service create/docker stack deploy? This adds a new reason for containers to exit, similar to healthchecks never reaching the healthy state. Service creation would have to wait for all secrets with on_failure:terminate set to be unmounted before deciding a container/service is stable.
For looking at other orchestrators, I looked at Kubernetes, Podman, and Azure, none of which support unmounting secrets. But the more important thing is that the CRI standard doesn't appear to support unmounting.