`clickhouse` image: questions to make it slim, remove unnecessary dependencies
Hello, together with @santrancisco we have a task in our backlog to strip the image down to no dependencies. It's planned as a separate tag with no `entrypoint.sh` at all. ClickHouse itself will be managed manually by mounting config and data files into the container.
So far we're investigating a few potential approaches, and I'd like to ask which one you would approve:
- `FROM scratch` with copying things from the glibc-donor, as was done in https://github.com/ClickHouse/docker-library/blob/ef22522b5e70f64e6ea359f21ce091581bc03061/server/25.6.12.10/Dockerfile.alpine#L23 and wasn't approved for `FROM alpine` images
- `FROM busybox:glibc` with the same approach
Before starting to dig deeper, we'd like to have feedback on what the preferred way is.
Thanks for reaching out first! :heart:
I think I'd prefer to see `FROM scratch`, ideally with `COPY --from=clickhouse:x.y.z`. :eyes:
(`busybox:glibc` isn't actually intended for this use case, although it is a popular way to use it :sweat_smile:)
Expanding on the prior Alpine example in case it's helpful, that was because of the mixing of multiple (shared) libc in a single image, and an explicit request from the Alpine project that we disallow that in otherwise ostensibly "Alpine" images (because of the breakage it tends to cause, and the reports they then often receive about it). :+1:
Now it's clear regarding alpine, thanks!
If we go in the direction of `COPY --from=clickhouse:$VERSION ...`, can we define it in the same LDF?
For example, the tags `25.8.2.32` and `25.8.2.32-slim` would be built in the same scope. Can bashbrew order the builds by dependency, so that `25.8.2.32-slim` is built only after the first one is ready?
Yep, that's exactly how the system is designed to work (except across all images in the program, not just within a single repository). :eyes:
See https://github.com/docker-library/tomcat/blob/99b7e90b0a58c8cf2990958971c8adac3c6a4f57/11.0/jre21/temurin-noble/Dockerfile#L22 for an example of exactly this pattern (copying the compiled "Tomcat Native" artifacts from Tomcat's JDK images into the corresponding JRE images). 👍
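As a rough illustration of the pattern discussed above (not an actual Dockerfile from either repository), a `-slim` variant might look something like the sketch below. The binary path, the x86_64 library paths, and the list of glibc files are assumptions and would need to be checked against the real `clickhouse` image, for example with `ldd /usr/bin/clickhouse`. The config path is likewise only an example of where a user might mount their own file.

```dockerfile
# Sketch only: every path below is an assumption to verify against the donor image.
FROM scratch

# The server binary, taken from the regular (non-slim) image of the same version.
COPY --from=clickhouse:25.8.2.32 /usr/bin/clickhouse /usr/bin/clickhouse

# The glibc dynamic loader and the shared libraries the binary links against
# (x86_64 paths assumed; other architectures use different directories).
COPY --from=clickhouse:25.8.2.32 /lib64/ld-linux-x86-64.so.2 /lib64/
COPY --from=clickhouse:25.8.2.32 \
    /lib/x86_64-linux-gnu/libc.so.6 \
    /lib/x86_64-linux-gnu/libm.so.6 \
    /lib/x86_64-linux-gnu/libdl.so.2 \
    /lib/x86_64-linux-gnu/libpthread.so.0 \
    /lib/x86_64-linux-gnu/librt.so.1 \
    /lib/x86_64-linux-gnu/

# No entrypoint.sh: the binary is started directly, and the config file is
# expected to be mounted by whoever runs the container.
ENTRYPOINT ["/usr/bin/clickhouse", "server", "--config-file=/etc/clickhouse-server/config.xml"]
```

Because the `COPY --from` source is another tag of the same image, the slim tag can only be built once the regular tag exists, which is the dependency ordering described above.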
Thanks!
We'll work in this direction
As another potential security improvement, we want to run the containers as the non-root `clickhouse` user by default. I see that other DB images don't do this, but some images do. https://github.com/ClickHouse/ClickHouse/pull/89718/files#diff-430817660f9a8f75103a9617b8ab1a9a38a14fef5f22c50b301c2a0a06d89875 adds `USER clickhouse`, and there is a documentation change below it as well.
Would that be a blocker?
That's a choice that's up to you -- we generally avoid it because it means the image is slightly harder to use with roughly the same end-result (for example, that means that filesystem permissions cannot be auto-fixed by the image before stepping down from root and thus are completely the user's responsibility), and either way users can always use --user if they want/need this security ratchet themselves. :+1:
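To make the trade-off concrete, here is a minimal, hypothetical fragment. It assumes the base image already defines a `clickhouse` account, which should be verified against the real image.

```dockerfile
# Hypothetical fragment; the base tag is the example tag used in this thread.
FROM clickhouse:25.8.2.32

# Image-level default: every process starts as the unprivileged account,
# so nothing in the image can fix ownership of mounted volumes at startup.
# Leaving this line out keeps root as the default, and a deployment can
# still opt in at run time with `docker run --user`.
USER clickhouse
```

With `USER` set in the image, the ownership of mounted data and config directories has to be correct before the container starts.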
Thanks, that's valid input. I'll experiment more with different cases to find out whether it's a real problem or just a slight inconvenience.