caddy-docker

Consider using libcap API instead of the file effective capability bit

Open fice-t opened this issue 9 months ago • 5 comments

I was testing out some container hardening features and ran into the following error message when attempting to drop all capabilities inside the official Caddy container (e.g. <docker/podman> run --cap-drop=ALL caddy):

caddy[12709]: {"msg":"exec container process `/usr/bin/caddy`: Operation not permitted","level":"error","time":"2025-04-04T19:20:00.696030Z"}

It works fine with only CAP_NET_BIND_SERVICE, which makes sense for the default config. The problem is that the error persists even after configuring Caddy to only use unprivileged ports. Indeed, the container produces the same error even when the command is simply caddy --version!

The error message above comes from crun during container initialization. The root cause appears to be the file effective capability bit set on the executable with setcap.

While normally fine, the file effective capability bit has a drawback: execve fails with EPERM if the process cannot obtain all of the file's permitted capabilities (see the section "Safety checking for capability-dumb binaries" in capabilities(7)). This is the case when --cap-drop=ALL is used in both Podman and Docker.
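To make the failure mode concrete, here is a minimal Go sketch that models the exec-time check described in capabilities(7). It is a simplified model and not kernel code: it ignores ambient capabilities and the special handling of root, and the names (fileCaps, execPermitted, errEPERM) are made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// capNetBindService is the mask for CAP_NET_BIND_SERVICE (bit 10 on Linux).
const capNetBindService = uint64(1) << 10

// errEPERM stands in for the EPERM that execve(2) would return.
var errEPERM = errors.New("operation not permitted")

// fileCaps is a hypothetical model of what setcap stores on an executable.
type fileCaps struct {
	permitted   uint64
	inheritable uint64
	effective   bool // the single file effective bit (the "e" in +ep)
}

// execPermitted computes the permitted set the process would get after exec:
//   P'(permitted) = (P(inheritable) & F(inheritable)) | (F(permitted) & P(bounding))
// If the file effective bit is set and any file permitted capability could
// not be obtained, the exec is refused (the "capability-dumb" safety check).
func execPermitted(bounding, procInheritable uint64, f fileCaps) (uint64, error) {
	newPermitted := (procInheritable & f.inheritable) | (f.permitted & bounding)
	if f.effective && f.permitted&^newPermitted != 0 {
		return 0, errEPERM
	}
	return newPermitted, nil
}

func main() {
	dumb := fileCaps{permitted: capNetBindService, effective: true} // setcap ...=+ep
	aware := fileCaps{permitted: capNetBindService}                  // setcap ...=+p

	// --cap-drop=ALL leaves the process with an empty bounding set.
	_, err := execPermitted(0, 0, dumb)
	fmt.Println("+ep with empty bounding set:", err)

	p, err := execPermitted(0, 0, aware)
	fmt.Println("+p with empty bounding set:", p, err)
}
```

In this model the +ep binary is refused outright when the bounding set is empty, while the +p binary starts with an empty permitted set and only runs into trouble if it later needs the capability.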

A solution is to use the libcap API, as mentioned in capabilities(7), and avoid the file effective capability bit. Caddy would then require the capability only when it is actually needed.

A workaround is to not drop CAP_NET_BIND_SERVICE, which should be fairly safe inside containers. Still, considering that the capability is actually unnecessary in some setups (e.g. socket activation), users should be able to remove it in the default container setup. Also, troubleshooting why this error is occurring even when the binary isn't using the capability is a bit frustrating.

If this proposal is denied, then perhaps the docs should be updated to mention that users may run into this permission error in some environments even when not using any functionality that needs the capability.

fice-t avatar Apr 05 '25 06:04 fice-t

I edited the description and title to be more specific: this issue is not with file capabilities in general, just the file effective capability bit (i.e. the end result would use setcap cap_net_bind_service=+p /usr/bin/caddy instead of +ep).

There is official libcap documentation that describes a sample Go program that uses libcap to manipulate the CAP_NET_BIND_SERVICE capability. Not everything applies (as unconditionally dropping the capability would not always work with config reloading), but moving the capability raising to inside the program allows Caddy to gracefully handle the case when the capability is not present and not needed.
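As a rough illustration of handling the "not present" case gracefully, the sketch below only inspects the current process's capability sets by parsing /proc/self/status with the Go standard library. It does not raise anything and does not use libcap; the actual raise from permitted to effective would go through the libcap API mentioned above. The helper names (parseCapMask, hasCap) are hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// capNetBindService is the bit index of CAP_NET_BIND_SERVICE on Linux.
const capNetBindService = 10

// parseCapMask extracts a hex capability mask (e.g. the "CapPrm" field)
// from the contents of /proc/<pid>/status.
func parseCapMask(status, field string) (uint64, error) {
	sc := bufio.NewScanner(strings.NewReader(status))
	for sc.Scan() {
		name, value, ok := strings.Cut(sc.Text(), ":")
		if ok && name == field {
			return strconv.ParseUint(strings.TrimSpace(value), 16, 64)
		}
	}
	return 0, fmt.Errorf("field %q not found", field)
}

// hasCap reports whether the given capability bit is set in mask.
func hasCap(mask uint64, bit uint) bool {
	return mask&(1<<bit) != 0
}

func main() {
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		// Not on Linux, or /proc is unavailable.
		fmt.Println("cannot read /proc/self/status:", err)
		return
	}
	for _, field := range []string{"CapPrm", "CapEff", "CapBnd"} {
		mask, err := parseCapMask(string(data), field)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Printf("%s has CAP_NET_BIND_SERVICE: %v\n", field, hasCap(mask, capNetBindService))
	}
}
```

With +p and the capability still in the bounding set, one would expect CapPrm to include the bit while CapEff does not until the program raises it; with --cap-drop=ALL neither set would include it, and the program could simply carry on if it only binds unprivileged ports.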

fice-t avatar Apr 05 '25 19:04 fice-t

@mholt While this issue does affect the docker image (involves removing the e from the RUN setcap instruction), it depends on changes in the Caddy source code, so I think the main Caddy repo is the better place for this.

Also, other environments besides containers can run into this, such as systemd services with the CapabilityBoundingSet property. For example, when using a distro-packaged caddy with a unit file or systemd-run:

$ sudo setcap cap_net_bind_service=+ep $(which caddy)
$ sudo systemd-run -p CapabilityBoundingSet= caddy version

The above runs normally without CapabilityBoundingSet, but with the property it results in: run-r8b81194e9a5e4e02aa78be767f19d14b.service: Failed to execute /usr/bin/caddy: Operation not permitted

fice-t avatar Apr 10 '25 22:04 fice-t

What changes to the source code are needed?

(Sorry, I'm not a Docker user, so I'm not very familiar with it. Someone else will have to take this up.)

mholt avatar Apr 15 '25 15:04 mholt

For anyone using caddy:builder (to add plugins, for example), you can pretty easily implement a workaround by adding RUN setcap cap_net_bind_service=+p /usr/bin/caddy to the end of the Dockerfile.

broizter avatar Jul 06 '25 15:07 broizter

> What changes to the source code are needed?

To Caddy itself? Just a small bit of Go to raise the Linux capability from the permitted set to the effective set. I'm not familiar with Go, but it was very simple to do in Rust, so it should only be a couple of lines AFAIK.

I have a PR at CoreDNS that heavily documents this same concern, with plenty of information to reference there (bar Go code). They've not taken further action since, though.


Presently the capability is added to the image via setcap with +ep, an approach officially known as "capability dumb" (the program is not capability aware).

  • You should only need +p IIRC, which marks the capability as permitted for the process (approved for use); at runtime the program then raises it to effective (granted to the process).
  • The +e enforces this via a kernel check at exec time, before the process starts: if the capability is not available (for example because it was dropped, as shown above), the program is not permitted to execute and fails with Operation not permitted. If the capability is in the bounding set (so that it remains in the permitted set), it is implicitly raised to the effective set.

All you'd do instead of that "capability dumb" approach is have your runtime code raise the capability to effective when it is actually needed.

  • If Caddy were to bind to a port that is permitted (I forget the sysctl for that off the top of my head), then no action is required and everything works fine.
  • While Docker (and I believe Podman now?) defaults (at least with rootful containers) to no port restriction, Kubernetes had not yet made this change last I checked (so IIRC, the default there requires the capability for any port below 1024). In that case Caddy would realize this and attempt to raise the capability into the effective set for its process (this takes roughly a single syscall). If that fails (e.g. because the capability was dropped by an admin), Caddy would output a helpful error about the issue rather than Operation not permitted.

> (Sorry, I'm not a Docker user, so I'm not very familiar with it. Someone else will have to take this up.)

So just to be clear, this isn't Docker specific; it's Linux specific. It's just easier to encounter with Docker because some projects distribute their software as container images that rely on a non-root user (instead of directing users to rootless containers for the security benefits). As a workaround, they resolve capability issues like this one via setcap +ep, which they apply only to their container-packaged build rather than to all builds (where a more helpful failure would be presented instead, such as a failure to bind to a port due to permissions).

With CoreDNS I tried to push forward a proper fix. Ironically, a security-conscious deployment that runs an image as root (when you know what you're doing) and drops all capabilities is prevented from running the software, even when it doesn't use a feature that requires the capability in the first place 😕 (Caddy almost ran into this concern too with an HTTP/3 feature, where granting the capability would have reduced security for those who don't even use the feature)


UPDATE: The related CoreDNS PR (which relies on functionality from a Caddy v1 fork for network binding) now has detailed systemd-run and docker run examples, along with a basic program that demonstrates the issue and how to properly handle and resolve it at runtime (similar to the libcap API suggestion from the author of this issue 👍)

polarathene avatar Nov 03 '25 10:11 polarathene