Not possible to mount paths that are excluded by dockerignore

Open thaJeztah opened this issue 5 years ago • 9 comments

This may be (somewhat) expected, but thought I'd open a ticket, because I can see use-cases where this functionality would be useful.

Description

I'm trying to exclude paths in the build-context (through .dockerignore), to prevent those paths from being included in the image that is built. However, some steps make use of the excluded files, and to provide access, I'm using RUN --mount, to "overlay" the excluded files.

Prepare

mkdir excluded_mount && cd excluded_mount

mkdir -p assets src
touch assets/some-file.txt src/some-source-file.txt

cat > Dockerfile <<EOF
#syntax=docker/dockerfile:1.2

FROM busybox
WORKDIR /project
COPY . .

# Mount the assets directory, and recursively show all files in the project
# directory. Exit with a non-zero exit code, so that the results are printed.
RUN --mount=source=/assets,target=/project/assets ls -lR && exit 1
EOF

Without dockerignore

Build the Dockerfile, and notice that the assets directory is successfully mounted

$ DOCKER_BUILDKIT=1 docker build --no-cache .

[+] Building 2.7s (10/10) FINISHED
 => [internal] load build definition from Dockerfile                                     0.2s
 => => transferring dockerfile: 181B                                                     0.0s
 => [internal] load .dockerignore                                                        0.2s
 => => transferring context: 2B                                                          0.0s
 => resolve image config for docker.io/docker/dockerfile:1.2                             1.1s
 => CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2f... 0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                        0.0s
 => [1/4] FROM docker.io/library/busybox                                                 0.0s
 => [internal] load build context                                                        0.1s
 => => transferring context: 303B                                                        0.0s
 => CACHED [2/4] WORKDIR /project                                                        0.0s
 => [3/4] COPY . .                                                                       0.2s
 => ERROR [4/4] RUN --mount=source=/assets,target=/project/assets ls -lR && exit 1       0.5s
------
 > [4/4] RUN --mount=source=/assets,target=/project/assets ls -lR && exit 1:
#10 0.353 .:
#10 0.353 total 12
#10 0.353 -rw-r--r--    1 root     root           137 Jan 13 12:21 Dockerfile
#10 0.353 drwxr-xr-x    2 root     root          4096 Jan 13 12:20 assets
#10 0.353 drwxr-xr-x    2 root     root          4096 Jan 13 12:20 src
#10 0.353
#10 0.353 ./assets:
#10 0.353 total 0
#10 0.353 -rw-r--r--    1 root     root             0 Jan 13 12:19 some-file.txt
#10 0.353
#10 0.353 ./src:
#10 0.353 total 0
#10 0.353 -rw-r--r--    1 root     root             0 Jan 13 12:19 some-source-file.txt
------
executor failed running [/bin/sh -c ls -lR && exit 1]: exit code: 1

With a .dockerignore

Create a .dockerignore to exclude the assets directory from COPY:

echo "/assets/" > Dockerfile.dockerignore

Build the image again;

$ DOCKER_BUILDKIT=1 docker build --no-cache .

[+] Building 2.3s (10/10) FINISHED
 => [internal] load build definition from Dockerfile                                     0.2s
 => => transferring dockerfile: 103B                                                     0.0s
 => [internal] load .dockerignore                                                        0.2s
 => => transferring context: 2B                                                          0.0s
 => resolve image config for docker.io/docker/dockerfile:1.2                             1.2s
 => CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2f... 0.0s
 => [internal] load metadata for docker.io/library/busybox:latest                        0.0s
 => [internal] load build context                                                        0.1s
 => => transferring context: 157B                                                        0.0s
 => [1/4] FROM docker.io/library/busybox                                                 0.0s
 => CACHED [2/4] WORKDIR /project                                                        0.0s
 => CANCELED [3/4] COPY . .                                                              0.3s
 => ERROR [4/4] RUN --mount=source=/assets,target=/project/assets ls -lR && exit 1       0.0s
------
 > [4/4] RUN --mount=source=/assets,target=/project/assets ls -lR && exit 1:
------
failed to compute cache key: "/assets" not found: not found

What I expected

  • the .dockerignore to exclude the files when using COPY / ADD, but RUN --mount to have access to files in the build-context.
  • a clearer error in case of a failure;
    • "failed to compute cache key" is confusing, and feels like an implementation detail that's not of interest to the end-user
    • "/assets" not found: not found; "not found" is included twice in the error
    • "/assets" not found: not found; "not found" does not mention that the /assets path is excluded

thaJeztah avatar Jan 13 '21 12:01 thaJeztah

/cc @tonistiigi @tiborvass

thaJeztah avatar Jan 13 '21 12:01 thaJeztah

Probably somewhat related; https://github.com/moby/moby/issues/15771 / https://github.com/moby/moby/issues/37333

thaJeztah avatar Jan 13 '21 13:01 thaJeztah

Yes, this is expected. Mounts with type=bind and no from default to build context. Build context is the same as the source for COPY and applies to .dockerignore rules.

tonistiigi avatar Jan 14 '21 05:01 tonistiigi
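
To make the reply above concrete, here is a minimal sketch that writes the short mount form from the report next to the same mount with the default type spelled out. The directory name bind_default_demo is hypothetical; the point is only that neither form names a "from", so both resolve the source against the build context and are therefore subject to .dockerignore.

```shell
# Hypothetical demo directory; not part of the original report.
mkdir -p bind_default_demo/assets
touch bind_default_demo/assets/some-file.txt

cat > bind_default_demo/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1.2
FROM busybox
WORKDIR /project

# Short form from the report: "type" and "from" are omitted.
RUN --mount=source=/assets,target=/project/assets ls -l assets

# Same mount with the default type written out. There is still no "from",
# so the source is looked up in the build context, after .dockerignore
# has been applied.
RUN --mount=type=bind,source=/assets,target=/project/assets ls -l assets
EOF
```

Building it with DOCKER_BUILDKIT=1 docker build bind_default_demo should list the file in both steps as long as no ignore file excludes /assets.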

Build context is the same as the source for COPY and applies to .dockerignore rules.

Yup, I understand, and I was somewhat expecting that to be the case. The devil is in the details there;

For "classic" builder, COPY did not use a session, so the only way to prevent sending unused files/directories to the daemon was to use a .dockerignore. BuildKit uses sessions, which for many situations makes .dockerignore redundant (if your COPY / ADD instructions are specific enough).

Unfortunately, there are still situations where "being specific" is either hard or "impossible", for example when "most" files are needed (the whole project, except for some paths). If those paths are never needed, a .dockerignore works, but when (e.g.) some stages don't need the files and other stages do, it's difficult.

I was hoping --mount would be "smart" here, and because I explicitly picked a path that's excluded (but not the "root"), that it would use a separate context/session for that, and allow me to access those files. (Thinking if that would be problematic, because that would also mean that the --mount could potentially use a snapshot of the build-context that was created at a different time than the build-context used for the COPY; perhaps I'm over-thinking that).

What would be the best way to address these scenarios?

  • COPY --exclude (or similar); allow excluding files for individual COPY statements? Do we want these scoped for each COPY, or have some notion of "per stage excludes"? Something like;
    FROM foo AS mystage
    EXCLUDE *.foo 
    EXCLUDE --ignore-file=/.dockerignore
    
  • ignore / exclude option for --mount (possibly allow overriding .dockerignore)?
  • support for multiple build-contexts (https://github.com/moby/moby/issues/37129)?
  • other ideas?

thaJeztah avatar Jan 14 '21 11:01 thaJeztah

.dockerignore should really be used like a .gitignore: for ignoring files that are completely unnecessary for docker to track, not for making decisions based on target/build configuration. .dockerignore is also not applied to builds from remote sources (tar/git), which adds to the confusion if it is misused.

So yeah, buildkit ignores the directories that are not used anyway, even without .dockerignore . The rules are the same for the COPY path and for --mount. Internally they are exactly the same thing and that consistency also makes sense for the user.

Having more complicated exclusion filters on COPY or setting default filters in Dockerfile (per stage) is something that can be discussed (likely already an issue).

tonistiigi avatar Jan 14 '21 22:01 tonistiigi
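
One way to read the two comments above is that the original reproduction can work without any .dockerignore if COPY is specific enough. The sketch below assumes the same layout as the report (assets and src) inside a hypothetical directory named specific_copy_demo; it is an illustration of that reading, not a fix proposed in the thread.

```shell
# Hypothetical demo directory mirroring the layout from the report.
mkdir -p specific_copy_demo/assets specific_copy_demo/src
touch specific_copy_demo/assets/some-file.txt
touch specific_copy_demo/src/some-source-file.txt

cat > specific_copy_demo/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1.2
FROM busybox
WORKDIR /project

# Copy only what the image needs; the files under /assets are not copied.
COPY src/ ./src/

# /assets is not excluded by any ignore file, so the bind mount can see it
# for the duration of this step.
RUN --mount=source=/assets,target=/project/assets ls -lR
EOF
```

Running DOCKER_BUILDKIT=1 docker build specific_copy_demo should show some-file.txt during the RUN step. This does not help in the "whole project except some paths" case described earlier, which is where the COPY --exclude idea comes in.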

I have just stumbled across this issue, and I would suggest that this behavior is very counterintuitive, as --mount in the context of .gitignore is often used to mount the source code inside the container, including temporary files that you may not want in the final container. This BuildKit behavior breaks that usage.

slmjy avatar Aug 23 '23 19:08 slmjy

another common use case is bind mounting .git for one RUN to determine which git tag is currently being baked (e.g. for python packages using setuptools_scm).

obviously we don't want the entire .git folder in the layer, so we have to add it to .dockerignore because we have to COPY . before the RUN

but then you can't bind mount it anymore, a catch-22 that defeats the purpose of bind mounting imo.

ddelange avatar Sep 28 '24 20:09 ddelange

obviously we don't want the entire .git folder in the layer, so we have to add it to .dockerignore because we have to COPY . before the RUN

@ddelange for that last part, there's a feature being worked on to allow excluding files for a specific COPY through an --exclude option. That option is not yet in the stable dockerfile syntax (only in the labs variant), so requires you to set a syntax-directive in your Dockerfile; for example;

# syntax=docker/dockerfile:1-labs

FROM alpine
WORKDIR /example

# copy everything, except for the `.git` directory
# and files in ".dockerignore"
COPY --exclude=/.git . .

See the documentation here; https://docs.docker.com/reference/dockerfile/#copy---exclude

thaJeztah avatar Sep 30 '24 09:09 thaJeztah
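
Putting the suggestion above together with the .git use case, a rough sketch could look like the following. It assumes that .git is left out of .dockerignore, that the labs syntax accepts the /.git pattern exactly as written in the comment above, and that git is installed in the image; the directory name git_mount_demo and the VERSION file are hypothetical.

```shell
# Hypothetical demo directory; to build it, this must be a git repository
# with at least one commit.
mkdir -p git_mount_demo

cat > git_mount_demo/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1-labs
FROM alpine
RUN apk add --no-cache git
WORKDIR /example

# Copy everything except .git (and anything listed in .dockerignore).
COPY --exclude=/.git . .

# .git is deliberately NOT listed in .dockerignore, so it can be
# bind-mounted for this single step without ending up in a layer.
RUN --mount=source=/.git,target=/example/.git \
    git describe --tags --always > /example/VERSION
EOF
```

After git init and a first commit inside git_mount_demo, docker buildx build git_mount_demo would be the command to try. The exclusion then lives on the COPY instruction rather than in .dockerignore, which is what keeps the mount usable.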

Another use case is caching node_modules and other caches (.pnpm-store, .terraform, etc) using --cache-from while building in multi-stage builds.

Using --mount=type=cache is not suitable as docker layers are not shared between runners in CIs like GitLab.

f15u avatar Oct 15 '24 12:10 f15u

If the concern/design decision stems from security concerns around a Dockerfile being able to mount any directory on the host, then it feels like we should be given some sort of --i-understand-please-mount-any-path-on-my-host flag, because I control the Dockerfile and know it's not malicious.

magnus-bakke avatar Feb 26 '25 04:02 magnus-bakke

If you want to pass another directory (even one ignored by .dockerignore) to the build, you can do it with the --build-context flag. Every build context has its own .dockerignore. But if the directory you pass to docker build has a .dockerignore file inside, the builder will comply with the rules in that file.

 # ls -a
.		..		.dockerignore	Dockerfile	abc		foo
 # cat .dockerignore
foo
 # cat Dockerfile
from alpine
copy . /dir1
copy --from=other . /dir2
run ls -l /dir1 && ls -l /dir2 && stop
 #
 # ls -a foo
.		..		.dockerignore	a1		a2		a3
 # cat foo/.dockerignore
a2
 #
 #
 # docker buildx build --build-context other=foo .
....
 > [stage-0 4/4] RUN ls -l /dir1 && ls -l /dir2 && stop:
0.095 total 4
0.095 -rw-r--r--    1 root     root            90 Feb 26 05:27 Dockerfile
0.095 -rw-r--r--    1 root     root             0 Feb 26 05:25 abc
0.096 total 0
0.096 -rw-r--r--    1 root     root             0 Feb 26 05:25 a1
0.096 -rw-r--r--    1 root     root             0 Feb 26 05:25 a3

tonistiigi avatar Feb 26 '25 05:02 tonistiigi
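
Applied to the original reproduction, the --build-context approach might look like the sketch below. The context name "assets" and the directory named_context_demo are assumptions made for illustration, and the syntax line assumes a Dockerfile frontend recent enough to support named contexts. The main context still ignores /assets, while the mount reads from the separately passed context.

```shell
# Hypothetical demo directory mirroring the layout from the report.
mkdir -p named_context_demo/assets named_context_demo/src
touch named_context_demo/assets/some-file.txt
touch named_context_demo/src/some-source-file.txt

# Exclude /assets from the main build context, as in the report.
echo "/assets/" > named_context_demo/.dockerignore

cat > named_context_demo/Dockerfile <<'EOF'
# syntax=docker/dockerfile:1
FROM busybox
WORKDIR /project

# COPY uses the main context, so /assets is skipped.
COPY . .

# "assets" is a named context passed on the command line; its root is
# mounted, independent of the main context's .dockerignore.
RUN --mount=from=assets,target=/project/assets ls -lR
EOF
```

The matching build command would be docker buildx build --build-context assets=named_context_demo/assets named_context_demo. If the named context is not passed, the builder would presumably try to resolve "assets" as an image name instead.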