feat: Refactor `Dockerfile`
Summary
TL;DR:
- 4GB disk usage down to 1.7GB (some extra weight externalized via cache mounts).
- The 68MB binary now builds to a smaller 24MB size.
- Layer invalidation concerns reduced, especially with the use of cache mounts (less meaningful for CI, but good for local use).
- Additional changes detailed below.
- The first commit introduces the larger diff, as adopting the HereDoc feature introduces indents and merges some `RUN` layers.
  - I've also relocated and grouped some directives where appropriate, minimizing layer invalidation.
  - HereDoc usage brings the added benefit of better formatting for multi-line `RUN` content.
  - There doesn't seem to be a need for the early `--target` until explicitly adding the target triple.
- The 2nd commit (will follow shortly) addresses the excess disk weight of the builder image layers that's less obvious from the multi-stage build.
  - When ignoring the final stage, the total disk usage for the builder image is about 4GB.

    Expand to see breakdown (4.11GB image)

    - 80MB => Ubuntu base image (~116MB after `apt update`)
    - +450MB => `apt update` + system deps
    - +13MB => pip `/root/.venv` creation
    - +390MB => `pip install cargo-zigbuild`
      - 306MB `/root/.venv/lib/python3.12/site-packages/ziglang`
      - 79MB `/root/.cache/pip/http-v2` (can be skipped with `pip install --no-cache-dir`)
      - 2.7MB `/root/.venv/bin/cargo-zigbuild`
    - +1.3GB => rustup + toolchain + target setup
      - 15MB rustup install
      - 1.3GB rust toolchain `target add` (over 600MB is from the toolchain installing docs at `~/.rustup/toolchains/1.78-x86_64-unknown-linux-gnu/share/doc/rust/html`; the `minimal` rustup profile doesn't apply due to `rust-toolchain.toml`)
    - +1.8GB => build cache for `uv`:
      - 270MB `/root/.cache/zig` from cargo-zigbuild
      - 300MB `/root/.cargo/registry/` (51MB `cache/`, 21MB `index/`, 225MB `src/`)
      - 1.2GB `target/` dir
        - 360MB `target/release/{build,deps}` (180MB each)
        - 835MB `target/<triple>/release/` (168MB `build/`, 667MB `deps/`)
      - 68MB `uv` (copied from the target build dir)
  - The `RUN --mount` usage allows for locally caching useful data that would otherwise be invalidated across builds when you'd rather keep it available. It also minimizes the impact of unused layers from past builds piling up until a prune is run. This is mostly a benefit for local usage rather than CI runners.
  - With the fix for the rustup profile and proper handling of cacheable content, the image size falls below 1.6GB and iterative builds of the image are much faster.
- The 3rd / 4th commits simplify the cross-platform build. There isn't a need for the conditional block to install a target.
  - Since the image is built from the same build host (`FROM --platform=${BUILDPLATFORM}`), you can just install both targets in advance, then use the `TARGETPLATFORM` ARG to implicitly get the preferred platform's binary build (see the sketch after this summary).
  - Now the `builder` stage shares all the same common layers, instead of diverging at the `TARGETPLATFORM` ARG. This is viable because `cargo-zigbuild` supports cross-platform builds easily.
- The 5th commit tackles the inline comment about optimizing the build size of the binary, reducing it from the current 68MB to 24MB, better than the current `uv` binary distributed via other install methods. You can use UPX to reduce this further from 23.6MB to 6.7MB (<10% of the current published size for the image), at the expense of ~300ms startup latency and potential false positives from AV scanners.
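To illustrate how the cache mounts and the shared cross-platform `builder` stage fit together, here's a rough sketch (not the exact diff in this PR; the base image, target triples, cache mount paths, and binary name are illustrative):

```dockerfile
# Rough sketch only, not the PR's actual Dockerfile.
FROM --platform=${BUILDPLATFORM} ubuntu:22.04 AS builder
WORKDIR /app
# ... rustup + cargo-zigbuild setup omitted ...

# Install both targets up front so every platform build shares the same layers:
RUN rustup target add x86_64-unknown-linux-musl aarch64-unknown-linux-musl

COPY . .

# TARGETPLATFORM (e.g. linux/amd64) implicitly selects the triple at build time,
# while cache mounts keep the registry and target/ out of the image layers:
ARG TARGETPLATFORM
RUN --mount=type=cache,target=/root/.cargo/registry \
    --mount=type=cache,target=/app/target \
    case "${TARGETPLATFORM}" in \
      "linux/arm64") triple="aarch64-unknown-linux-musl" ;; \
      *)             triple="x86_64-unknown-linux-musl" ;; \
    esac \
    && cargo zigbuild --release --target "${triple}" \
    && cp "target/${triple}/release/uv" /uv
```

The `cp` out of the cache mount is what persists the binary into an image layer, since the mounted `target/` itself never becomes part of the image.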
Test Plan
I've built the image locally and the binary output is still produced successfully. Let me know if you'd like me to test something specific?
```console
# Using Docker v25 on Windows 11 via WSL2, with docker-container driver builder:
$ docker buildx create --name=container --driver=docker-container --use --bootstrap
# Excess platforms omitted:
$ docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
container * docker-container
container0 unix:///var/run/docker.sock running v0.13.2 linux/amd64, linux/arm64
# Building with both platforms and outputting the image contents from both platforms:
$ docker buildx build --builder=container --platform=linux/arm64,linux/amd64 --tag local/uv --output type=local,dest=/tmp/uv .
# ... Build output ...
=> exporting to client directory
=> => copying files linux/arm64 20.98MB
=> => copying files linux/amd64 23.67MB
# Exported contents:
$ ls /tmp/uv/*
/tmp/uv/linux_amd64:
io uv
/tmp/uv/linux_arm64:
io uv
# AMD64:
$ du -h /tmp/uv/linux_amd64/uv
23M /tmp/uv/linux_amd64/uv
$ file /tmp/uv/linux_amd64/uv
/tmp/uv/linux_amd64/uv: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
# ARM64:
$ du -h /tmp/uv/linux_arm64/uv
20M /tmp/uv/linux_arm64/uv
$ file /tmp/uv/linux_arm64/uv
/tmp/uv/linux_arm64/uv: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, stripped
# Equivalent output to running `docker run --rm -it local/uv`
$ /tmp/uv/linux_arm64/uv
Usage: uv [OPTIONS] <COMMAND>
Commands:
pip Resolve and install Python packages
venv Create a virtual environment
cache Manage the cache
version Display uv's version
help Print this message or the help of the given subcommand(s)
Options:
-q, --quiet Do not print any output
-v, --verbose... Use verbose output
--color <COLOR_CHOICE> Control colors in output [default: auto] [possible values: auto, always, never]
--native-tls Whether to load TLS certificates from the platform's native certificate store [env: UV_NATIVE_TLS=]
-n, --no-cache Avoid reading from or writing to the cache [env: UV_NO_CACHE=]
--cache-dir <CACHE_DIR> Path to the cache directory [env: UV_CACHE_DIR=]
-h, --help Print help (see more with '--help')
-V, --version Print version
```
Tagging @konstin for this one.
Hi, and thanks for looking into our dockerfile! I have two main concerns with the changes:
Docker is a secondary target for us (most users are using the curl, powershell or pip installer), so we'd like to keep it simple to maintain. The main audience are projects with a fully dockerized CI pipeline who want to run ruff or uv as a docker step; to people creating docker images we recommend the pip or curl installers inside the dockerfile. We don't intend for users to build the uv container themselves, so we haven't optimized for builder size. There is one big exception though: Building the image for the release job is the bottleneck in our release process, if that could be improved that would be amazing!
Performance is crucial for us, and we optimize performance over binary size.
TL;DR: I'm happy to adjust the Dockerfile if you'd still like to keep the build process in there.
I can look into optimizing the workflow to bring down that compilation bottleneck, but I suspect the pragmatic approach might just be to take an existing uv build from CI and publish an image with that?
so we'd like to keep it simple to maintain.
Sure! I can make an adjustment that prioritizes that over the other optimizations 👍
There is one big exception though: Building the image for the release job is the bottleneck in our release process, if that could be improved that would be amazing!
~~Could you please elaborate a bit more here? Is it with Github Actions and the time it takes to build the image?~~
I looked at the recent workflow runs and see the bulk of the time is spent compiling amd64 + arm64 stages concurrently, taking about 20+ minutes to complete.
As long as future runs don't trigger compiling everything all over again (i.e. the target/ dir for each platform is cached), that would speed up this step quite a bit.
However, you're already building uv with these binaries outside of Docker in CI? Why not just use FROM scratch + COPY like the final stage of the current Dockerfile, skipping the bottleneck concern?
I prefer having a Dockerfile as an easy way to build a project and not worry about dependencies or other setup, but since you don't seem to have an audience for that and the build step is purely for your CI... might as well skip all this since you want to prioritize keeping maintenance simple? (unless the build stage is useful to maintainers or users outside of CI?)
However, you're already building uv with these binaries outside of Docker in CI? Why not just use FROM scratch + COPY like the final stage of the current Dockerfile, skipping the bottleneck concern?
We would like to retain the isolation of the Dockerfile for building the image. The build stage isn't used by us otherwise, we just want to publish the final image, everything else goes through cargo build --release or maturin build --release. Anything that makes the build faster (incl. caching) would be great. Note that we update dependencies often, so Cargo.lock doesn't work as a cache key unfortunately
I've tried to summarize the considerations in the TLDR, let me know which direction you'd like to take.
TL;DR:
- The build stage in the `Dockerfile` could be kept for potential use elsewhere, but an alternative stage that is easy to maintain could be used purely by the release CI to provide existing release binaries already built, completely removing the bottleneck.
- Otherwise, to optimize the cache you need to cache only what is relevant, but presently you cache image layers that aren't helping that much and cannot cache the data of interest due to layer invalidation. You will need the cache mount syntax or a similar approach to restore `target/` from cache.
- `sccache` is probably needed for the above cache optimization until `cargo` moves away from `mtime`-based invalidation (which makes the `target/` cache likely ignored).
- Splitting the image builds by target to separate runners and caching the final output is an alternative approach. It'll shift the maintenance complexity to the GH workflow where it needs to merge the two final images into a single multi-platform image manifest (which is currently done implicitly).
We would like to retain the isolation of the Dockerfile for building the image. Anything that makes the build faster (incl. caching) would be great.
Just to confirm, that isolation isn't serving any real purpose though? You could use the CI binaries you already built for AMD64/ARM64 if you wanted to, right?
If you just want the build stage to remain in the Dockerfile as a useful isolated build environment for reproductions / documentation or something like that, that's fine 👍
- You could still use that stage for testing PRs or anything else in CI if you find a use for it. However for the release workflow, where it's not really needed, you could have an alternative stage that does take an external binary as input.
- As you stated, you don't really have any actual need or value for the build stage in the release, as you're only publishing an image with the binary, so you might as well use the same binary that you distribute elsewhere?
At least that is the more pragmatic / simple approach to take and doesn't complicate maintenance. Optimizing the build stage for CI will require cache optimizations such as the cache mounts I showed earlier; otherwise the layer that creates target/ during the build is invalidated and that data is discarded... which means the cache import/export you presently do provides no benefit there.
Anything that makes the build faster (incl. caching) would be great. Note that we update dependencies often, so Cargo.lock doesn't work as a cache key unfortunately
Cargo.lock doesn't matter much here; it's the target/ dir you want to cache, not the crate sources. You can key the cache on any content/input you'd like, and when there's no exact hit GH will fall back to the latest matching cache item instead, IIRC.
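As a sketch of that fallback behaviour with `actions/cache` (the key names are just an example, and this assumes target/ is on the runner filesystem or injected into the image build by a separate step):

```yaml
# Hypothetical example: an exact key miss falls back to the most recent "cargo-target-*" cache.
- name: Cache cargo target dir
  uses: actions/cache@v4
  with:
    path: target/
    key: cargo-target-${{ runner.os }}-${{ github.sha }}
    restore-keys: |
      cargo-target-${{ runner.os }}-
```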
You're looking at roughly 1GB after compilation. That can be compressed fairly quickly with zstd down to 300MB; there isn't much value in higher compression levels (the tradeoff is barely worth it for 50MB-ish less). So that's 600MB for both targets to build with. There is little value in caching anything else in the image itself; the toolchain and crates all install fast enough that you're just wasting the 10GB of cache storage available by doing that. I would only cache what is useful and assists with the speedup.
I haven't tried sccache, but this might be worthwhile to look into. A key problem with builds in CI is that git clone doesn't preserve the mtime attribute on files, yet cargo relies on that for its own "cache key"; thus even with target/ restored, since the source files have different mtimes I think it'll still perform quite a bit of unnecessary compilation. By leveraging sccache instead, I think we can avoid that overhead.
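As a rough sketch, wiring sccache into the build stage could look like this (the cache path, mount id, and the assumption that sccache is installed earlier in the stage are mine, not from the current Dockerfile):

```dockerfile
# Hypothetical sketch: route rustc through sccache so compile results survive mtime churn,
# persisting its cache dir via a BuildKit cache mount (assumes sccache was installed earlier).
ENV RUSTC_WRAPPER="sccache" \
    SCCACHE_DIR="/root/.cache/sccache"
RUN --mount=type=cache,target=/root/.cache/sccache,id=sccache \
    cargo zigbuild --release --bin "${APP_NAME}" --target "${CARGO_BUILD_TARGET}"
```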
Finally, one other trick you can use is to split the build across two runners, one for each target. In your case this works quite well due to the minimal image you actually care about. If the cache upload is only your final scratch image with just the binary, then you can quite easily leverage the cache to retrieve both platforms (using the release tag as the cache key, for example) and have these two separate images combined into a multi-platform manifest (which is what's already happening implicitly). The added complexity here splits the 20+ min build-time bottleneck in half, so that it more closely aligns with the other jobs running. You may not need this optimization though if the sccache approach works well enough.
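For reference, the final merge could be roughly as simple as the following, assuming each runner pushed its own single-platform tag (the image names here are placeholders):

```console
# Hypothetical sketch: combine two single-platform images into one multi-platform manifest.
$ docker buildx imagetools create \
    --tag ghcr.io/example/uv:latest \
    ghcr.io/example/uv:latest-amd64 \
    ghcr.io/example/uv:latest-arm64
```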
Splitting the image builds by target to separate runners and caching the final output is an alternative approach. It'll shift the maintenance complexity to the GH workflow where it needs to merge the two final images into a single multi-platform image manifest (which is currently done implicitly).
Splitting into two separate jobs would be a good idea.
I don't have experience with caching docker builds beyond layer caching; if you know a way, I'm happy to use that.
You could still use that stage for testing PRs or anything else in CI if you find a use for it. However for the release workflow, where it's not really needed, you could have an alternative stage that does take an external binary as input.
We only build the docker container for release, we don't use those images ourselves otherwise.
You're looking at roughly 1GB after compilation. That can be compressed fairly quickly with zstd down to 300MB; there isn't much value in higher compression levels (the tradeoff is barely worth it for 50MB-ish less). So that's 600MB for both targets to build with. There is little value in caching anything else in the image itself; the toolchain and crates all install fast enough that you're just wasting the 10GB of cache storage available by doing that. I would only cache what is useful and assists with the speedup.
The GitHub docs say:
GitHub will remove any cache entries that have not been accessed in over 7 days. There is no limit on the number of caches you can store, but the total size of all caches in a repository is limited to 10 GB. Once a repository has reached its maximum cache storage, the cache eviction policy will create space by deleting the oldest caches in the repository.
If we get into a release cadence of once a week, the cache will already be evicted by then; I'm afraid this wouldn't work for us.
TL;DR:
- You can build and cache on a schedule (daily) or when the `main` branch is pushed to. Since it's only intended for releases, daily builds might work well? (Ideally minimizing cache storage use so other workflows' usage isn't impacted.)
- Your GH Actions cache is somehow at 50GB from the past 2 days, despite the 10GB limit it's meant to impose 🤷‍♂️
- Splitting the build to separate runners should be simple enough, I think.
- Cache mounts to speed up the build may be viable; I'll look into trying it with a test repo this weekend.
- You could alternatively consider `depot.dev`, as they seem to have an open-source tier which, like other services, may equate to free use. They're apparently quite simple to switch to and claim to speed up image builds considerably (especially for ARM64, due to native build nodes).
If we get into a release cadence of once a week, the cache will already be evicted by then; I'm afraid this wouldn't work for us.
Technically you could probably have a scheduled build job that just builds and caches the image, or triggers when the main branch is updated (something we have at docker-mailserver, except we publish a preview/beta image). You don't have to publish that way, but it should keep the cache available for an actual release.
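For example, the triggers for such a build-and-cache workflow could be as minimal as this (the cron time is arbitrary):

```yaml
# Hypothetical trigger block: build (and cache) on pushes to main plus a daily schedule,
# without publishing; a release run can then reuse the still-warm cache.
on:
  push:
    branches: [main]
  schedule:
    - cron: "0 4 * * *"
```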
That said, I don't know if there's any nice ways to partition/manage the CI cache... and wow, I just looked at your current cache use:
Your largest cache items are about 1GB of which multiple were uploaded within the same hours. So not quite sure what's happening there, perhaps a bit of a bug on GH side. You have one item (2MB) from almost a week ago, while everything else is within the past 2 days, so it's not like it's evicting the old items regularly when exceeding the 10GB capacity limit 🤔
Splitting into two separate jobs would be a good idea.
The way this is handled for docker-mailserver I explained here with some rough examples. We have it modularized into separate build and publish workflows, which much smaller and simpler workflows can then call (GH Actions calls these "reusable" workflows).
That way you can separately build and cache the two images on a schedule or when commits are pushed to the main branch as described above, with the separate publishing workflow triggered only when tagging new releases?
You wanted to keep the Dockerfile simple to maintain, so I'm not sure how you feel about this kind of change for the CI?
I don't have experience with caching docker builds beyond layer caching; if you know a way, I'm happy to use that.
I don't have any workflow examples to cite for this as I haven't done it yet myself. There is a GHA cache exporter for BuildKit that can be used... oh, you're already using it here:
https://github.com/astral-sh/uv/blob/0a055b79420fdf1342c278e97d0c78cc487df89b/.github/workflows/build-docker.yml#L64-L65
It was experimental when I implemented the workflows for docker-mailserver. There's a separate action that allows leveraging that cache method when building a Dockerfile that uses cache mounts. It seems there was a related effort for this approach 4 months ago. This is the action I had in mind for supporting cache mounts.
Alternatively, you could apply to depot.dev for an open-source plan to be granted for uv? They claim to be considerably faster at builds, automate the cache for you, and offer native ARM64 hosts instead of the slower QEMU emulation we're presently limited to on GH.
So from a maintenance point of view, depot.dev may be a simpler option available to you (assuming it's like other services such as Netlify and Docker Hub, whose similar plans haven't cost docker-mailserver anything to adopt).
The cache mounts seem nice to have but yeah per my earlier attempt in #995 probably not worth it in GitHub Actions. I did consider https://github.com/reproducible-containers/buildkit-cache-dance but it didn't seem worth the dependency and complexity for us. It seems like the ecosystem is lagging a bit here and I've just been waiting for those upstream issues to be fixed. We could consider depot.dev, I'm not sure if we'd get a discount since we're VC funded but the project is open source 🤷♀️ I'm hesitant to administer an external service for something that's not causing us problems right now, but perhaps that's the simplest path forward.
Sorry about the back and forth and hesitance here. This is a tough pull request to review; we appreciate that you're investing time into it, but the release artifacts are a critical part of our software. And, since we're a small team, we're careful about taking on more maintenance and services to manage.
I wrote this last week intending to respond the same day, but it seems I never sent it (just came across the tab). I've still got this on my backlog and hopefully can tackle it this weekend instead 👍
Original response
Sorry about the back and forth and hesitance here. This is a tough pull request to review; we appreciate that you're investing time into it, but the release artifacts are a critical part of our software. And, since we're a small team, we're careful about taking on more maintenance and services to manage.
No worries! If you'd prefer to leave the Dockerfile as it is, we can close the PR - or I can improve only the Dockerfile and not bother with the caching improvements for the CI in this PR 👍
When I revisit this in the weekend I'll take this approach:
- Build the image for each platform (AMD64 + ARM64) on separate runners.
- Upload the final image output only as cache (subsequent builds won't be sped up, but that's only relevant if you release again within 7 days).
- Next job depends on the prior two completing, brings in their output and publishes the multi-platform image.
Alternatively (reduced complexity, minimal time to publish):
- The CI could verify the build stage is successful without depending upon it for the binary to publish? This avoids any bottleneck on publish time for the actual image, since publishing an image for release doesn't need to block on verifying a build via the `Dockerfile`.
- A separate `release.Dockerfile` (naming convention) could be equivalent to the final `scratch` stage but with `COPY` adding the CI build artifact for the respective image, and publish that for a quick release (see the sketch after this list).
- This wouldn't change the value of the `Dockerfile`, which still builds self-contained by default; it's just a difference in how the CI uses it. Although I could understand wanting the officially published images to actually match the `Dockerfile` in the repo.
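A minimal sketch of what that `release.Dockerfile` could look like, assuming the workflow first downloads the per-platform CI artifacts into a local `artifacts/` directory (that layout is an assumption on my part):

```dockerfile
# Hypothetical release.Dockerfile: publish prebuilt CI binaries without rebuilding in Docker.
FROM scratch
ARG TARGETPLATFORM
# Assumes binaries were placed at ./artifacts/linux/amd64/uv and ./artifacts/linux/arm64/uv
COPY ./artifacts/${TARGETPLATFORM}/uv /uv
ENTRYPOINT ["/uv"]
```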
Cache Mounts + CI Action
I did consider https://github.com/reproducible-containers/buildkit-cache-dance but it didn't seem worth the dependency and complexity for us.
Anything in particular about the complexity? It adds one additional step into the mix (technically two since you don't have actions/cache in this workflow):
```yaml
- name: inject cache into docker
  uses: reproducible-containers/[email protected]
  with:
    cache-map: |
      {
        "var-cache-apt": "/var/cache/apt",
        "var-lib-apt": "/var/lib/apt"
      }
    skip-extraction: ${{ steps.cache.outputs.cache-hit }}
```
All that's being configured is one or more locations in the container to cache, plus a cache key to skip extraction when there's an exact match (actions/cache will download the latest relevant cache found if there isn't an exact match).
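For completeness, the `steps.cache.outputs.cache-hit` above refers to a companion `actions/cache` step along these lines (the paths and key follow their README example, so treat the specifics as an assumption):

```yaml
# Hypothetical companion step providing the `cache` id referenced by skip-extraction above.
- name: Cache apt directories
  id: cache
  uses: actions/cache@v4
  with:
    path: |
      var-cache-apt
      var-lib-apt
    key: cache-${{ hashFiles('Dockerfile') }}
```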
On the Dockerfile side, yes their example is more complicated:
```dockerfile
FROM ubuntu:22.04
ENV DEBIAN_FRONTEND=noninteractive
RUN \
  --mount=type=cache,target=/var/cache/apt,sharing=locked \
  --mount=type=cache,target=/var/lib/apt,sharing=locked \
  rm -f /etc/apt/apt.conf.d/docker-clean && \
  echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' >/etc/apt/apt.conf.d/keep-cache && \
  apt-get update && \
  apt-get install -y gcc
```
This is because they're:
- Using two separate cache mounts.
- Ubuntu images have a post-hook for `apt install` to clean up that cache, thus you need to disable that and adjust the configuration.
With Ubuntu images the `apt update` command can sometimes be slow, and `apt install` varies. That's easily remedied by using an alternative base image like Fedora, but in your CI this isn't where the bottleneck is.
I can look into this as a follow-up, but I think just caching the sccache directory may help a fair amount? So the Dockerfile adjustment would be in the build stage and look something like this:
```dockerfile
RUN --mount=type=cache,target="/path/to/cache",id="cache-dirs" <<HEREDOC
  cargo zigbuild --release --bin "${APP_NAME}" --target "${CARGO_BUILD_TARGET}"
  cp "target/${CARGO_BUILD_TARGET}/release/${APP_NAME}" "/${APP_NAME}"
HEREDOC
```
or as two separate `RUN` instructions (if that's easier to grok without the HereDoc):
```dockerfile
RUN --mount=type=cache,target="/path/to/cache",id="cache-dirs" \
    cargo zigbuild --release --bin "${APP_NAME}" --target "${CARGO_BUILD_TARGET}"
RUN cp "target/${CARGO_BUILD_TARGET}/release/${APP_NAME}" "/${APP_NAME}"
```