docker build --no-cache uses cache anyway :'(
Description
I'm trying to build an image based on a custom image that itself references Maven. I'm using the Docker CLI on Fedora 36 (Linux).
Here is the Dockerfile:
FROM lanico/whanos-java:latest
WORKDIR /app
COPY . .
WORKDIR /app/app
RUN mvn package
RUN ls -R /app
COPY /app/app/target/app.jar .
CMD ["java" , "-jar", "app.jar"]
The base image referenced by this Dockerfile is built from this one:
FROM maven:3.8.5-openjdk-17
I successfully built and pushed the referenced image using:
docker build --no-cache -t lanico/whanos-java:latest
docker push lanico/whanos-java:latest
But then when I try to build the first one (the one that references it), it uses the cache anyway:
docker build --no-cache --pull -t lanico/test-java:latest
Output:
$ docker build --no-cache --pull -t lanico/test-java:latest .
[+] Building 1.6s (12/12) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 203B 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/lanico/whanos-java:latest 1.4s
=> [auth] lanico/whanos-java:pull token for registry-1.docker.io 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 3.81kB 0.0s
=> [1/7] FROM docker.io/lanico/whanos-java:latest@sha256:2ed711842f59534bac58dc1834c678068011a611ac53aefcded028d533c05e4b 0.1s
=> => resolve docker.io/lanico/whanos-java:latest@sha256:2ed711842f59534bac58dc1834c678068011a611ac53aefcded028d533c05e4b 0.1s
=> CACHED [2/7] WORKDIR /app 0.0s
=> CACHED [3/7] COPY . . 0.0s
=> CACHED [4/7] WORKDIR /app/app 0.0s
=> CACHED [5/7] RUN mvn package 0.0s
=> CACHED [6/7] RUN ls -R /app 0.0s
=> ERROR [7/7] COPY /app/app/target/app.jar . 0.0s
------
> [7/7] COPY /app/app/target/app.jar .:
------
Dockerfile:7
--------------------
5 | RUN mvn package
6 | RUN ls -R /app
7 | >>> COPY /app/app/target/app.jar .
8 | CMD ["java" , "-jar", "app.jar"]
--------------------
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref moby::zceqi6bcku5ggj2hte8gg96i0: "/app/app/target/app.jar": not found
Docker has cached the mvn package step from a previous failing build, so it is not rebuilding target/app.jar :'(
It's not even running my ls command.
I've tried:
docker system prune
docker builder prune --all
still NOT working :(
Reproduce
- docker build --no-cache -t
- See that it uses cache :'(
Expected behavior
docker build --no-cache should not show me CACHED in the output, and should not use the cache (how can it use the cache when I'm removing it prior to the build?!)
docker version
Client: Docker Engine - Community
Version: 23.0.1
API version: 1.42
Go version: go1.19.5
Git commit: a5ee5b1
Built: Thu Feb 9 19:50:04 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 23.0.1
API version: 1.42 (minimum version 1.12)
Go version: go1.19.5
Git commit: bc3805a
Built: Thu Feb 9 19:47:02 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.14
GitCommit: 9ba4b250366a5ddde94bb7c9d1def331423aa323
runc:
Version: 1.1.4
GitCommit: v1.1.4-0-g5fd4c4d
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client:
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.10.2
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.16.0
Path: /usr/libexec/docker/cli-plugins/docker-compose
scan: Docker Scan (Docker Inc.)
Version: v0.23.0
Path: /usr/libexec/docker/cli-plugins/docker-scan
Server:
Containers: 1
Running: 1
Paused: 0
Stopped: 0
Images: 75
Server Version: 23.0.1
Storage Driver: btrfs
Btrfs:
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9ba4b250366a5ddde94bb7c9d1def331423aa323
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.1.11-100.fc36.x86_64
Operating System: Fedora Linux 36 (Workstation Edition)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.643GiB
Name: fedora
ID: QFM4:EZPH:JWCQ:ZUQ5:AMV4:4FOL:OSD5:V35N:EHZM:FBLX:QP5M:WBEI
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: lanico
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Additional Info
Linux kernel version : Linux fedora 6.1.11-100.fc36.x86_64
@gummyWalrus what was the solution??
docker build --no-cache is still not working. Did we have a solution for this?
@Praneethvvs do you have more details? I think some of this may be a presentation issue (there's a ticket somewhere with a longer discussion, but couldn't find it directly).
When using --no-cache, BuildKit will skip the cache for certain steps (such as RUN), but for other steps (such as COPY), it may still use the cache after re-verifying the cache. BuildKit validates the checksum of the files used, and if nothing changed, it will use the cache for those steps (as there would be no need to re-do the step).
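As a rough sketch of that behavior (the tag, file name, and base image here are made up), a repro might look like this; on the second invocation the RUN step re-executes, but the COPY step can still be reported as CACHED because the checksum of hello.txt is unchanged:

```shell
# Hypothetical repro; assumes a local Docker daemon.
cat > Dockerfile <<'EOF'
FROM alpine:3.18
COPY hello.txt /hello.txt
RUN date > /built-at.txt
EOF
echo "hello" > hello.txt

docker build --no-cache -t nocache-demo .
# Second run: RUN re-executes, COPY may still print CACHED after re-verification.
docker build --no-cache -t nocache-demo .
```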
@thaJeztah I see some layers are still being cached and from your comment above I understand this was supposed to happen. This answers my question "BuildKit validates the checksum of the files used, and if nothing changed, it will use the cache for those steps". I was wondering why my COPY step is still running from cache even after using --no-cache and I actually had to prune all the cache. Thanks for the details.
@Praneethvvs do you have more details? I think some of this may be a presentation issue (there's a ticket somewhere with a longer discussion, but couldn't find it directly).
When using --no-cache, BuildKit will skip the cache for certain steps (such as RUN), but for other steps (such as COPY), it may still use the cache after re-verifying it. BuildKit validates the checksum of the files used, and if nothing changed, it will use the cache for those steps (as there would be no need to re-do the step).
This seems like a disservice to the --no-cache option. If I am running with --no-cache, I want the entire build run without caching; the whole point is to rebuild from scratch. Can there at least be a "value" option for --no-cache, something like --no-cache all, to run ALL steps without cache regardless of whether it seems pointless or not?
Still using caching for some steps removes the ability to use --no-cache to test a completely raw build.
The best place to request such a feature would be in the BuildKit repository, where that code is maintained (https://github.com/moby/buildkit)
I'm curious though; do you have a specific scenario where forcing to re-create the same files instead of restoring the same files from cache makes a difference? I understand the "presentation" can be somewhat confusing (there was some discussion at some point to change CACHED to e.g. CACHE VERIFIED to show that the cache was validated), but wondering if you have a specific example where it makes a difference for the image / build result.
I'm curious though; do you have a specific scenario where forcing to re-create the same files instead of restoring the same files from cache makes a difference?
I'll chime in here: my builds on a project started failing last week due to this caching. I'm not exactly sure what changed, but Docker suddenly started aggressively caching files it didn't actually have in the image. Several COPY commands fail with the following message:
COPY ./wait-for.sh ./
ERROR: failed to calculate checksum of ref moby::3upz0pvfkv28cwdwmr2klalwz: "/wait-for.sh": not found
Even though the above file was clearly in the repo and not ignored. Stranger still: if you use wildcards to copy files, some files and folders started being skipped in a seemingly random fashion, even though the COPY reported success (and CACHED). Attempting to re-run previously successful builds now fails as well.
So far I have yet to find a workaround; purging and --no-cache didn't help at all. This is on GitHub specifically; things work fine locally.
This is on GitHub specifically, things work fine locally.
That looks more like either a bug, or an issue with the nodes; if you have more details on that, please open a ticket in https://github.com/moby/buildkit with details; the issue may depend on what version of docker is installed (e.g., there have been issues with recent distro-packaged versions of docker on ubuntu)
@thaJeztah thanks for the heads-up, I posted the issue here: https://github.com/moby/buildkit/issues/4132
Happy to provide any more info that I can.
Why was this closed?? This issue still exists.
@mcfriend99 read the discussion before posting next time, please.
@thaJeztah Didn't realise. Right post, wrong thread.
Same here: --no-cache is not working, so the COPY command fails because the folder is not there (because the old copy was cached).
@faq885 same answer; https://github.com/docker/buildx/issues/2387
That looks more like either a bug, or an issue with the nodes; if you have more details on that, please open a ticket in https://github.com/moby/buildkit with details; the issue may depend on what version of docker is installed (e.g., there have been issues with recent distro-packaged versions of docker on ubuntu)
+1. The build continues to use the cache for my RUN commands and fails at a COPY command that depends on one of them (the RUN generates the file that needs to be copied):
ERROR: failed to solve: failed to compute cache key: failed to calculate checksum of ref moby::g8j6icchxv81cf7fu1yodfl1p: "/promtool": not found
Ahh, solved for me. The file that the failing COPY command was trying to copy was not present in the working directory.
Example:
COPY /src/some_file /container-dir/some-file
There was no file at /src/some_file; this somehow caused the docker build to use the cached results for the other instructions.
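To restate the pitfall with made-up paths: a COPY source is resolved against the build context on the host, not the image filesystem, so a file created by an earlier RUN can only be copied by another RUN:

```dockerfile
# Hypothetical sketch; paths are illustrative only.
FROM alpine:3.18
# This file exists only inside the image, not in the build context:
RUN mkdir -p /src && echo data > /src/some_file
# COPY /src/some_file /container-dir/some-file   <- looked up in the build context: fails
# An in-image copy works:
RUN mkdir -p /container-dir && cp /src/some_file /container-dir/some-file
```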
For what it's worth, @Sairav seems to be correct, at least for me: --no-cache kept being ignored (judging by the CACHED labels in the output) until I commented out a broken COPY directive. This is... not very intuitive.
I'm finding that even RUN statements are showing as CACHED.
jfoster@JGF-MBP-14-2022 flutter % docker --version
Docker version 24.0.7, build afdd53b
jfoster@JGF-MBP-14-2022 flutter % docker build --no-cache -t theia-flutter .
[+] Building 0.1s (17/18) docker:desktop-linux
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 1.63kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/theia-common:latest 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 2B 0.0s
=> CACHED [ 1/14] FROM docker.io/library/theia-common 0.0s
=> [ 2/14] RUN echo "Hello, world!" > /tmp/hello.txt 0.1s
=> CACHED [ 3/14] RUN apt update && apt install -y libglu1-mesa 0.0s
=> CACHED [ 4/14] WORKDIR /opt 0.0s
=> CACHED [ 5/14] RUN wget https://foo.com/bar.txt 0.0s
=> CACHED [ 6/14] RUN wget https://storage.googleapis.com/flutter_infra_release/releases/stable/li 0.0s
=> CACHED [ 7/14] RUN tar xf flutter* && rm flutter*.tar.xz 0.0s
=> CACHED [ 8/14] RUN ln -s /opt/flutter/bin/dart /usr/local/bin/dart && ln -s /opt/flutter/bi 0.0s
=> CACHED [ 9/14] RUN chmod -R o+w /opt/flutter 0.0s
=> CACHED [10/14] RUN wget https://github.com/Dart-Code/Dart-Code/releases/download/v3.80.0/dart-c 0.0s
=> CACHED [11/14] RUN wget https://github.com/Dart-Code/Flutter/releases/download/v3.80.0/flutter- 0.0s
=> ERROR [12/14] ADD dart-code-3.80.0.vsix /opt/theia/plugins/ 0.0s
=> ERROR [13/14] ADD flutter-3.80.0.vsix /opt/theia/plugins/ 0.0s
Editing the foo.com/bar.txt line still shows cached!
@jgfoster same answer; https://github.com/docker/buildx/issues/2387
please open a ticket in https://github.com/moby/buildkit with details
The issue is still there. The first related issue in BuildKit has been renamed and no longer addresses this problem; the second related issue is not related to it either.
So, for now, the --no-cache flag is not working as intended, and the only way to work around it is to clear images and layers locally to really get no-cache behavior.
@Karreg same answer; https://github.com/docker/buildx/issues/2387. commenting here won't help. If you have steps to reproduce and suspect there's a bug, please open a ticket in the BuildKit issue tracker instead.
Of course. This message was more for the people who keep arriving here (this is where you end up when searching for this issue), to give them an update on how to work around it until it's fixed, so they won't have to keep coming back...
@Karreg same answer; #4041 (comment). commenting here won't help. If you have steps to reproduce and suspect there's a bug, please open a ticket in the BuildKit issue tracker instead.
But commenting here does help. I'm one of the ones who was brought here by searching for the exact same problem.
docker image prune -a still didn't clear the cache on the layers I'm trying to force to build.
Quick update: add a command to the Dockerfile to force a rebuild past that point, something like
RUN ls -lah
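Another common cache-busting trick (a sketch, not an official fix) is a build argument whose value changes on each build; every instruction after the ARG line is invalidated whenever the value changes:

```dockerfile
# Hypothetical sketch: CACHEBUST is an arbitrary name chosen here.
FROM alpine:3.18
RUN echo "this step may still come from cache"
ARG CACHEBUST=1
RUN echo "this step re-runs whenever CACHEBUST changes"
```

Invoked, for example, as: docker build --build-arg CACHEBUST=$(date +%s) -t myimage .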
I'm having the same issue: using --no-cache still builds cached or old versions of my Docker project...
In my case it was COPY as well: I accidentally wrote COPY when I should have written RUN cp. COPY copies from the build context into the container, while RUN cp copies from one place in the container to another.
In the original post, I hypothesize that the problem will be fixed by this change:
-COPY /app/app/target/app.jar .
+RUN cp /app/app/target/app.jar .
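Applied to the Dockerfile from the original post, that change might look like this (an untested sketch; the jar is produced inside the image by mvn package, so it has to be copied with RUN cp rather than COPY):

```dockerfile
FROM lanico/whanos-java:latest
WORKDIR /app
COPY . .
WORKDIR /app/app
RUN mvn package
# target/app.jar now exists inside the image; copy it within the image,
# not from the build context:
RUN cp /app/app/target/app.jar .
CMD ["java", "-jar", "app.jar"]
```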
I'm just trying to do a RUN ls -la . or RUN echo $(ls -la .), but it keeps getting cached no matter what I do; even when I rearrange the statements so that the command index changes, the value is still somehow CACHED on the first run.
Does anyone have a workaround for just doing a simple ls?
My full command:
docker build --no-cache --progress=plain --target ci -t exampleapp:targetLabel .
The CACHED output:
#6 [node 6/6] RUN echo "$(ls -la .)"
#6 CACHED
#7 [ci 1/5] RUN echo $(ls -la .)
#7 CACHED
Tested in: Docker version 24.0.7, build afdd53b And Docker version 26.1.1, build 4cf5afa
What were the steps before the ls -la .? Did the filesystem change?
How did this even break and how has a regression been open for nearly a year?
I have these commands:
FROM public.ecr.aws/docker/library/php:7.2.34-fpm-alpine AS deps
RUN echo "$(ls -lia '/app')"
RUN echo "$(ls -lia ${PWD})"
it runs the first RUN once, but the second RUN is reported as cached even though it's the first time that expression has ever run:
#19 [deps 2/18] RUN echo "$(ls -lia '/app')" && echo "$(date)"
#19 0.200 ls: /app: No such file or directory
#19 0.200
#19 0.200 Wed Jul 17 14:40:40 UTC 2024
#19 DONE 0.2s
#8 [deps 5/18] RUN echo "$(ls -lia ${PWD})" && echo "$(date)"
#8 CACHED
#9 [deps 4/18] RUN echo "$(ls -lia '/app')" && echo "$(date)"
#9 CACHED
#10 [deps 3/18] RUN echo "$(ls -lia ${PWD})" && echo "$(date)"
#10 CACHED
Even if I modify the commands and add more junk to a RUN command, the first time it runs it is treated as already completed, and only the first RUN command in the series is actually executed:
#8 [deps 3/18] RUN echo "$(ls -lia ${PWD})" && echo "$(date)" && echo "Clown"
#8 CACHED
#9 [deps 5/18] RUN echo "$(ls -lia ${PWD})" && echo "$(date)"
#9 CACHED
#10 [deps 4/18] RUN echo "$(ls -lia '/app')" && echo "$(date)"
#10 CACHED
#19 [deps 2/18] RUN echo "$(ls -lia '/app')" && echo "$(date)" && echo "Clown"
#19 0.239 ls: /app: No such file or directory
#19 0.239
#19 0.240 Wed Jul 17 14:42:08 UTC 2024
#19 0.240 Clown
#19 DONE 0.3s
This defies reason, and is exactly the reason why people want no cache. For anyone else, if you're building like:
docker build --no-cache --progress=plain -t php php/ --target deps
do not use --no-cache-filter with --progress=plain (it somehow unsets it), and note that --no-cache cannot be used together with --no-cache-filter.
Understanding what is happening at each layer is becoming increasingly difficult if I only get the same result the first or third time I run the command on a machine, and not in this weird intermediate state where the most benign of things can be considered cached.
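For skipping the cache on individual stages, --no-cache-filter takes stage names; a sketch against a multi-stage build (the stage name deps is taken from the earlier snippet, and this assumes a local Docker daemon):

```shell
# Hypothetical invocation: rebuild only the "deps" stage without cache.
# Note: cannot be combined with --no-cache.
docker build --no-cache-filter deps --target deps -t php php/
```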
Just give me none of the cache or layers you have. Wait, wait. I'm worried what you just heard was, "Give me a few cache or layers." What I said was, "Give me none of the cache or layers you have." Do you understand?