docker-nginx
docker-nginx copied to clipboard
Could we remove `curl` from nginx images (again)?
Recently when I removed curl
from our own, derived nginx
images, I was quite suprised to learn the base image actually contains curl
already since the upgrade to 1.18.0. This was added as a fix for #378
I was trying to remove curl
from our images, as there is no really good reason for it to be in a web server image in the first place (we were previously using it for health checks only), and it also triggers a range of security scanning reports with High severity. See for example a snyk report for current version. This has actually been reported as an issue also: #657
I am aware that those are false positives, but I would prefer a basic webserver image to come "as clean as possible" - now everybody scanning any derived image for vulnerabilities will have to deal with this in some way or the other.
But while it's easy to add curl
(or wget
for the sake of it) to an upstream base image like your nginx
images, it is not actually possible to my knowledge to remove it. Yes, we can apt-get remove
it and thus hide it, but it has still been installed as one of those immutable layers in the image...
Note that the same issue was actually also brought up by @hsblhsn for the unprivileged image here, but closed for the time being.
In fact the unprivileged image is also my main focus, but as those two images are closely related, I use both and this one here gets more attention, I figured I'd rather have this discussed here 😉
So could we maybe reconsider this decision?
Hi @sambernet and thank you for opening this issue.
I took quite some time to consider that proposition and I'm still not in ease with removing curl
for a couple of reasons.
First, surely it will help in lowering the amount of CVEs as found in the images, but it will definitely not make them disappear altogether - e.g. I would estimate the curl vulns to be around 30% of what we see? And there are more libraries that just keep on giving, like libxml2 or libgd2 that are included in the image due to the additional modules.
Second, it's been more or less a way of doing things for nginx
project as a whole to be consistent and almost never break backwards compatibility for features we provide. Thus removing curl from the images would break that assumption and make life a bit harder for those who relied on it since a few years.
With that in mind, and given the fact that you would prefer, quote, a basic webserver image to come "as clean as possible", would a new minimal/slim/whatever image (w/o curl, or any additional nginx modules) be acceptable for the use-case?
I understand that this is an additional burden for both docker-library and unprivileged images maintainers, so maybe we can limit that image to be only available on amd64 and aarch64, the most popular architectures? Should we limit it to alpine-only (given the minimality concept)?
I would really like a community input on this one.
cc @tianon @yosifkit @alessfg
I agree that a minimal/slim Alpine image might make sense. We could keep the size and contents to the bare minimum essentials for nginx
.
Re architectures, if we can, I would probably build for all the same architectures that we build the standard Alpine images for. From a user perspective, I would not be happy if a new minimal image were to be released and it happened to not work in my current environment just because it's not one of the most common architectures out there.
From the perspective of just image size, I'm not convinced -- installing curl
and deps in Alpine is ~2MiB total. :sweat_smile:
Regarding architectures, a good middle-ground (that would also lessen the load on build servers) could be to publish only for the architectures NGINX Inc actually builds for (https://github.com/nginxinc/docker-nginx/blob/d4a47bc6602d3a1412dad48a8513b83805605ef3/mainline/alpine/Dockerfile#L30-L43), but I would stress again that I'm not convinced this really adds much over the existing alpine
variant :see_no_evil:
Security scanners being over-zealous seems like a good reason to put pressure on the scanner vendors IMO (rather than trying to remove otherwise useful tools with negligible impact otherwise). For example, Trivy gives a much more reasonable list (although it lists CVE-2022-30065 for nginx:alpine
even though there are no packages which even can be updated in the latest nginx:alpine
image).
I agree that from a size perspective there wouldn't be too much of a gain, but I think it wouldn't hurt to have a version of the Alpine image that only installs the bare minimal essential packages.
Re architectures, I stand behind my original comment that more architectures would be better from a general user perspective, but I don't have a strong opinion. We could start by building for the architectures we already build for, like you mentioned, and if there ever is enough demand for other architectures, we can always revisit the list.
I 100% agree that security scanners seem to be over-zealous and basically report any and all vulnerabilities without digging into whether the vulnerabilities can be updated or (my personal pet peeve) whether the vulnerabilities actually affect the given image in any significant way. Maybe as the various security scanners out there get further developed they will improve their introspection capabilities.
I just noticed curl myself when I went to fork this image for our internal builds. I also would make use of a slim image over the current alpine image. I am fine with it being limited to Alpine builds on amd64 and aarch64.
I've pushed an implementation of a slim image based on alpine in https://github.com/nginxinc/docker-nginx/commit/1dca42f99b3f032d862a1d35e8a5b951d629dc98. It's not published on the hub yet, so you'll have to build it yourself to try it out. The main differences are only nginx and tzdata are installed, no curl, ca-certificates, or any nginx modules are baked in. Let me know if it lacks something or ships something we don't want - we have time until a next nginx release (a couple of weeks or so, I guess).
Thanks!
@tianon to be honest, I don't really care about image size (I even rarely use alpine images, just to avoid musl/glibc compatibility issues in the first place) 😉
My goal is generally rather to remove everything that is not going to be used from the systems we actually run in production. It's very easy to add curl
to any given image, but impossible to properly remove (by design). On a side-note: We run images with curl (and sometimes openssl
and ca-certificates
) explicitly removed using apt-get purge
as part of our own docker images.
I agree security scanners should analyse a bit deeper than just plain file contents - but then again: not having a dependency in the first place has a much lower attack surface than having it but "not using it"... it might still be used for privilege escalation and lateral movement if it's there.
Regarding the architecture discussion I agree with @alessfg, as I use the same image also on my personal PIs which are armv7 🙈
@thresheek I know I'm a bit late already, but I will try and have a look at your "slim" image variant those days (I mostly use the unprivileged image version, but I think we still run some instances of this original image as well).
It took me a little longer to verify this, as I had to convert our images from debian to alpine base image first (I had a range of bash
entrypoint scripts that I had to make POSIX-compliant along the way).
Now with that sorted out:
- @alessfg: my main use case is really the unprivileged image (we also deploy to OpenShift). I tested the slim image variants you already published, specifically the
1.23.1-alpine-slim
tag version. I had no issues with this version whatsoever. Also Snyk reported 0 vulnerabilities on my resulting images. :tada: - @thresheek I tested your slim version (for my use case of the original image) and it works perfectly fine so far as well. I also specifically tested a reverse proxy use-case with SSL termination on the proxy. I would be glad to have this version published as an official tag ❤️
Edit: typos
Since our privy-containerscan triggered on a recent curl CVE (https://security.alpinelinux.org/vuln/CVE-2022-42915)
we tried to remove curl via apt del curl
and this seemed to stop the containerscan from complaining. However, what I don't understand, is this we actually remove curl from the container, or did we just disable it, as TS/OP writes?
Hi Michael,
Its technically still in the image. If you open the resulting image in the tool called dive, it allows you to explore the layers of an image.
This post is a good, detailed write up on docker layers and should clarify any lingering questions you might have.
I hope this helps!
Fixed now in 1.23.4, 1.22.1 with the appearance of slim images.
Thanks again to everybody that was part of this effort, especially @alessfg, @tianon and of course @thresheek 🚀❤️
Even though this wasn't really a goal, I like the positive effect this has on resulting image size:
-
1.23.4-alpine3.17:
16.01MB
forlinux/amd64
-
1.23.4-alpine3.17-slim:
4.94MB
forlinux/amd64
That's a nifty image size decrease by a factor of ~3.3!
Even though this is not primarily due to not adding curl
, the new slim
version gives a real sleek base image 😁 🎉