buildkit
buildkit copied to clipboard
`COPY --link` not compatible with `chown`
Hello,
COPY --link is really great but I think it doesn't work yet with chown which makes inefficient layers because you have to chown manually after copying or just keep things simple with root.
Will COPY --link be compatible with --chown in the future?
There are some cases of chown that can't be implemented at all described in https://github.com/moby/buildkit/issues/2414#issuecomment-943645942
--chown with UID can be implemented and should be already. Do you see a different behaviour?
For --link --chown=user we could define that the user resolution happens in the context of the base image and not the previous layer. But that is not implemented atm. Not sure how correct that would be. It would still mess up the cache if you have inherited stages like.
FROM ... AS aa
COPY --link foo bar
FROM aa
COPY --link --chown=foo bar baz
Also note that some cases could be solvable with an intermediate stage. You can copy to the intermediate stage with --chown and then to final stage with --link. Your first run isn't as efficient of course but that should mean that your final layer still has the improved independent cache.
FYI @sipsma
Note: I stumbled upon this when searching for issues similar to my own; not sure if this is the appropriate place, let me know if I should open a separate issue.
When migrating some of my Dockerfiles to use COPY --link, I noticed some builds failing locally when combining it with the --chown flag, resulting in a message invalid user index: -1. However, this only happens in my local environment (Windows 11 + Docker Desktop 4.15.0), while building the same image in my CI pipeline using the docker:cli and docker:dind images works perfectly fine.
Is this some sort of intended behavior, where --link and --chown might not play nice together? Or is it limited to Docker Desktop on Windows somehow failing to resolve the uid/gid from the previous layer? Is there any difference to the implementation in Docker Desktop that prevents it from working there? The docker/buildx versions seem to be up to date.
A minimal example Dockerfile in which I face this issue is (using DOCKER_BUILDKIT=1 docker build .):
# syntax=docker/dockerfile:1
FROM php:fpm-alpine
COPY --link --chown=www-data:www-data ./foo /tmp/
Nevermind, misunderstanding
Is this some sort of intended behavior, where
--linkand--chownmight not play nice together? A minimal example Dockerfile in which I face this issue is (usingDOCKER_BUILDKIT=1 docker build .):
That command and Dockerfile work fine for me on Linux. It's only an issue (same error message you got) with buildx when I use docker-container driver instead of docker driver. Docker Engine: 20.10.22, buildx: 0.9.1.
I'm not familiar with Docker Desktop on Windows or usage on Windows in general. Perhaps it's using buildx ~~with the docker-container driver~~ even with your regular docker build?
Gotcha: Current Docker Engine releases silently ignore --link due to bundling old BuildKit release (almost 2 years old)
EDIT: Nope nevermind, I assumed --link was working but it was ignored and --chown was only applied.
Even though Docker 20.10.22 was released in Dec 2022, it's still shipping with BuildKit v0.8.4 apparently (commit from docker --version) which was released in April 2021. In contrast buildx 0.9.1 was released in Aug 2022 and uses a much newer BuildKit version (v0.11.0), but used the Docker Engine bundled BuildKit with the default docker driver/builder :grimacing:
I realized the possibility of an outdated BuildKit when I saw this comment mentioning --link got backported to BuildKit releases prior to v0.10.0 (March 2022) for compatibility by ignoring the option and doing a regular copy without throwing an error.. The BuildKit version used by Docker Engine wasn't obvious to check, but this describes how to fairly well.
Docs / Error could be improved
There are some cases of chown that can't be implemented at all
It'd be great if the official docs made it more clear that --chown and --link are not compatible in this manner, but are with numeric values.
The present error ERROR: failed to solve: invalid user index: -1 could also be more clear about that, especially since --chown=root seems to be accepted.
Added build cache weight via adding an intermediate stage as workaround
Also note that some cases could be solvable with an intermediate stage. You can copy to the intermediate stage with
--chownand then to final stage with--link. Your first run isn't as efficient of course but that should mean that your final layer still has the improved independent cache.
This will increase the size of the build cache by the weight of the intermediate stage though right? (possibly not an issue if data is not too large, or you don't have concerns such as CI cache storage limits)
In my case I was using COPY --link with an external image to copy over 230MB of data which was included into the CI build cache. When adding an intermediate stage with COPY --chown to switch the COPY --link --from to, this doubles storage used, resulting in a cache disk size of 594MB.
EDIT: In this case outputting build cache with mode=min would avoid storing the extra copy as it exports only the whole stage instead of each layer that mode=max does. You'd still incur an extra copy in build cache from the separate stage though.
--chown=user lookup from base image filesystem
For
--link --chown=userwe could define that the user resolution happens in the context of the base image and not the previous layer.
The main issue I have with the current required UID / GID values is that they're not static when they're installed from packages. That depends upon the order the packages are installed in, so when packages are added or removed for example, the values may no longer map to the expected user, so that's fairly fragile.
Regarding this alternative to lookup context from the base image instead, would that allow you to use a base image and install some packages, then use another FROM stage to reference as the base image to get access to the users installed from packages? That works :)
For now I ended up adding a RUN to add a user with an explicit UID before the package is installed (that would normally create the user). I can then provide that UID to COPY --link --chown :+1:
RUN adduser --quiet --system --group --disabled-password --home /var/lib/clamav --no-create-home --uid 200 clamav
# ... install clamav package
COPY --link --chown=200 --from=docker.io/clamav/clamav:latest /var/lib/clamav /var/lib/clamav
Although it's a bit tedious to make sure it's created in the same way as the package does, and varies by base image.
Looking up user-ids in the base-image sounds like a viable option to me. Especially, when we can do multi-layer-Dockerfiles. The only thing to look out for, is that changes in numeric UID are correctly detected when evaluating the respective steps.