New docker-container builders fail first bake using local cache
Behavior
Using a fresh buildx docker-container builder, a bake using a (populated) local cache and a build context (i.e. COPY, RUN --mount, etc.) will fail with one of ERROR: failed to solve: Canceled: grpc: the client connection is closing or ERROR: failed to solve: Unavailable: error reading from server: EOF.
Desired behavior
The first build with a fresh builder must succeed against a local cache for practical use of the local cache in CI applications. With a builder that has already baked the images, this issue becomes intermittent. That case should also succeed consistently.
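To make the CI scenario concrete, the intended flow is roughly the following (a sketch using the repro files below; the builder name is illustrative): each job starts from a brand-new builder and should be warmed purely from the restored cache directory.
docker buildx create --name ci_builder --driver docker-container
# cache directory restored from a previous CI run; cache-from/cache-to type=local are configured in images.json
docker buildx bake --builder ci_builder -f images.json layer
docker buildx rm ci_builder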
Environment
docker info:
Client:
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc., v0.9.1)
compose: Docker Compose (Docker Inc., v2.10.2)
extension: Manages Docker extensions (Docker Inc., v0.2.9)
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc., 0.6.0)
scan: Docker Scan (Docker Inc., v0.19.0)
Server:
Containers: 13
Running: 4
Paused: 0
Stopped: 9
Images: 59
Server Version: 20.10.17
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9cd3357b7fd7218e4aec3eae239db1f68a5a6ec6
runc version: v1.1.4-0-g5fd4c4d
init version: de40ad0
Security Options:
seccomp
Profile: default
cgroupns
Kernel Version: 5.10.124-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: aarch64
CPUs: 5
Total Memory: 7.667GiB
Name: docker-desktop
ID: P2BC:5HXV:5ELQ:YK6I:LRNJ:PVRL:FJ76:EZ7P:H2QB:QVXD:ON2C:AUVO
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5000
127.0.0.0/8
Live Restore Enabled: false
Steps to reproduce
Prepare the files (unzip this to skip):
$ mkdir base
$ mkdir layer
$ touch base/Dockerfile
$ touch base/file
$ touch layer/Dockerfile
$ touch images.json
base/Dockerfile:
FROM ubuntu as base
RUN sleep 2
COPY file file
layer/Dockerfile:
FROM base_target as layer
RUN sleep 5
images.json:
{
  "target": {
    "common": {
      "platforms": [
        "linux/amd64"
      ]
    },
    "base": {
      "context": "base",
      "cache-from": [
        "type=local,src=../cache/base"
      ],
      "cache-to": [
        "type=local,mode=max,dest=../cache/base"
      ],
      "inherits": ["common"],
      "tags": [
        "base"
      ]
    },
    "layer": {
      "context": "layer",
      "cache-from": [
        "type=local,src=../cache/layer"
      ],
      "cache-to": [
        "type=local,mode=max,dest=../cache/layer"
      ],
      "contexts": {
        "base_target": "target:base"
      },
      "inherits": ["common"],
      "tags": [
        "layer"
      ]
    }
  }
}
Create the builder:
docker buildx create --name container_driver_builder --driver docker-container
Populate the cache:
docker buildx bake --builder container_driver_builder -f images.json layer
For each subsequent test, remove the builder, recreate it, and rebuild the bake targets:
docker buildx rm container_driver_builder \
&& docker buildx create --name container_driver_builder --driver docker-container \
&& docker buildx bake --builder container_driver_builder -f images.json layer
Each such test fails with ERROR: failed to solve: Canceled: grpc: the client connection is closing or ERROR: failed to solve: Unavailable: error reading from server: EOF.
I got a stacktrace to better understand this issue:
2022/09/21 02:16:41 CalcSlowCache Canceled: grpc: the client connection is closing: unknown
1 v0.10.0-583-g3fab38923.m buildkitd --debug
github.com/moby/buildkit/session/content.(*callerContentStore).ReaderAt
/src/session/content/caller.go:81
github.com/moby/buildkit/util/contentutil.(*MultiProvider).ReaderAt
/src/util/contentutil/multiprovider.go:78
github.com/moby/buildkit/util/pull/pullprogress.(*ProviderWithProgress).ReaderAt
/src/util/pull/pullprogress/progress.go:28
github.com/moby/buildkit/util/contentutil.(*localFetcher).Fetch
/src/util/contentutil/copy.go:29
github.com/moby/buildkit/util/resolver/limited.(*fetcher).Fetch
/src/util/resolver/limited/group.go:113
github.com/containerd/containerd/remotes.fetch
/src/vendor/github.com/containerd/containerd/remotes/handlers.go:141
github.com/containerd/containerd/remotes.FetchHandler.func1
/src/vendor/github.com/containerd/containerd/remotes/handlers.go:103
github.com/moby/buildkit/util/resolver/retryhandler.New.func1
/src/util/resolver/retryhandler/retry.go:25
github.com/moby/buildkit/util/contentutil.Copy
/src/util/contentutil/copy.go:18
github.com/moby/buildkit/cache.lazyRefProvider.Unlazy.func1
/src/cache/remote.go:335
github.com/moby/buildkit/util/flightcontrol.(*call).run
/src/util/flightcontrol/flightcontrol.go:121
sync.(*Once).doSlow
/usr/local/go/src/sync/once.go:74
sync.(*Once).Do
/usr/local/go/src/sync/once.go:65
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
time="2022-09-21T02:16:41Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = failed to compute cache key: Canceled: grpc: the client connection is closing: unknown"
failed to compute cache key: Canceled: grpc: the client connection is closing: unknown
1 v0.10.0-583-g3fab38923.m buildkitd --debug
github.com/moby/buildkit/session/content.(*callerContentStore).ReaderAt
/src/session/content/caller.go:81
github.com/moby/buildkit/util/contentutil.(*MultiProvider).ReaderAt
/src/util/contentutil/multiprovider.go:78
github.com/moby/buildkit/util/pull/pullprogress.(*ProviderWithProgress).ReaderAt
/src/util/pull/pullprogress/progress.go:28
github.com/moby/buildkit/util/contentutil.(*localFetcher).Fetch
/src/util/contentutil/copy.go:29
github.com/moby/buildkit/util/resolver/limited.(*fetcher).Fetch
/src/util/resolver/limited/group.go:113
github.com/containerd/containerd/remotes.fetch
/src/vendor/github.com/containerd/containerd/remotes/handlers.go:141
github.com/containerd/containerd/remotes.FetchHandler.func1
/src/vendor/github.com/containerd/containerd/remotes/handlers.go:103
github.com/moby/buildkit/util/resolver/retryhandler.New.func1
/src/util/resolver/retryhandler/retry.go:25
github.com/moby/buildkit/util/contentutil.Copy
/src/util/contentutil/copy.go:18
github.com/moby/buildkit/cache.lazyRefProvider.Unlazy.func1
/src/cache/remote.go:335
github.com/moby/buildkit/util/flightcontrol.(*call).run
/src/util/flightcontrol/flightcontrol.go:121
sync.(*Once).doSlow
/usr/local/go/src/sync/once.go:74
sync.(*Once).Do
/usr/local/go/src/sync/once.go:65
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
1 v0.10.0-583-g3fab38923.m buildkitd --debug
github.com/moby/buildkit/solver.(*edge).createInputRequests.func1.1
/src/solver/edge.go:842
github.com/moby/buildkit/solver/internal/pipe.NewWithFunction.func2
/src/solver/internal/pipe/pipe.go:82
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
1 v0.10.0-583-g3fab38923.m buildkitd --debug
main.unaryInterceptor.func1
/src/cmd/buildkitd/main.go:572
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1
/src/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:25
github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1
/src/vendor/github.com/grpc-ecosystem/go-grpc-middleware/chain.go:34
github.com/moby/buildkit/api/services/control._Control_Solve_Handler
/src/api/services/control/control.pb.go:1718
google.golang.org/grpc.(*Server).processUnaryRPC
/src/vendor/google.golang.org/grpc/server.go:1283
google.golang.org/grpc.(*Server).handleStream
/src/vendor/google.golang.org/grpc/server.go:1620
google.golang.org/grpc.(*Server).serveStreams.func1.2
/src/vendor/google.golang.org/grpc/server.go:922
runtime.goexit
/usr/local/go/src/runtime/asm_arm64.s:1172
So the case is that the first build loads the cache but it remains only a lazy ref (https://github.com/moby/buildkit/blob/v0.10.4/cache/remote.go#L336) created with a provider from the session. Then a second build comes in when the first session has already been dropped and matches against the previous lazy ref. Then unlazy gets called and fails because the session is already gone.
@sipsma @ktock
I guess the simplest fix is to try to disable lazy behavior for local cache imports from the session, because it seems fragile.
More proper fixes would be to make sure the lazy ref is not matched if it comes from a different session, or to add the current session to the group (not sure if this is quite safe actually).
On bake we might need a fix as well to keep the original session alive until all builds have completed. I'm thinking of the case where a "local source" would need to be pulled in by a subsequent build (not sure how practical). But I think this cache issue could appear by just doing two individual builds with the same cache source from two different terminals.
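A minimal sketch of that two-terminal case (hypothetical commands, reusing the cache directory from the repro above):
# terminal 1
docker buildx build --builder container_driver_builder --cache-from type=local,src=cache/base base
# terminal 2, started after the first build's client session has gone away; it can match the lazy ref created above
docker buildx build --builder container_driver_builder --cache-from type=local,src=cache/base base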
Not sure if it's related to this issue, but since the upgrade to Docker version 23.0.0, which embeds buildx 0.10.2 as its default builder, some people are encountering issues when building a devcontainer (a feature of VS Code).
The error Fail to build a devcontainer: ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF seems to be the same one as shown in the description. Could someone confirm whether it's the same issue or a completely different one?
@jaudiger I don't think this is related. Can you show the output of docker buildx ls? If you have a simple repro with the Dockerfile, the build command, and logs, that would be handy.
@crazy-max While working on it, I put together a small repro, which can be done with this Dockerfile (Dockerfile) and this command:
docker buildx build --build-arg BUILDKIT_INLINE_CACHE=1 -f ./Dockerfile.txt -t test --target bar ./
[+] Building 1.4s (5/6)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 158B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/fedora:37 1.3s
=> [foo 1/1] FROM docker.io/library/fedora:37@sha256:3487c98481d1bba7e769cf7bcecd6343c2d383fdd6bed34ec541b6b23ef07664 0.0s
=> CACHED [bar 1/1] RUN echo "From inside" 0.0s
=> preparing layers for inline cache 0.1s
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
I found the culprit. If I remove the option BUILDKIT_INLINE_CACHE=1, it builds the image; with it, I get the error above.
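(The Dockerfile itself is attached above rather than inlined; judging from the build output, a minimal equivalent would be something like the two-stage file below, reconstructed here purely for illustration.)
FROM fedora:37 AS foo
FROM foo AS bar
RUN echo "From inside"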
I can confirm this new error. It happens on GitLab CI/CD with the docker:dind service when building a Docker container. It started happening today, with no changes to any Docker/CI or related files.
We're also seeing the same error suddenly appear despite not changing anything CI/CD or Docker related on our end for months. GitLab CI/CD running dind with Docker 20.10.13 with BUILDKIT_INLINE_CACHE=1. Followed @jaudiger's suggestion above, and removing the BUILDKIT_INLINE_CACHE=1 build arg does seem to fix the issue, but I'm curious why this would suddenly break without us having upgraded or changed anything on our end. Wondering if something changed on the Docker Hub side, since that's where all our repos are.
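In other words, the temporary workaround is just dropping that one build arg from the CI build step, at the cost of no longer embedding inline cache metadata in the image (illustrative command; $IMAGE_TAG is a placeholder):
# fails on the affected versions
docker build --build-arg BUILDKIT_INLINE_CACHE=1 -t $IMAGE_TAG .
# temporary workaround: build without inline cache metadata
docker build -t $IMAGE_TAG .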
Confirmed same error here.
Same error here. It worked fine yesterday, but today when launching a devcontainer I got the errors "ERROR: failed to solve: Unavailable: error reading from server: EOF" and "ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF". I scoured the internet looking for a solution, but so far I haven't found anything that would help me. I even reformatted my Ubuntu, but the error keeps happening.
Error logs:
On this repo: https://github.com/antonioconselheiro/bater-ponto
It happens just when trying to launch the devcontainer with the Microsoft extension for VS Code (using devcontainer open .).
@jaudiger @bwenzel2 @m-melis @antonioconselheiro Same as https://github.com/moby/buildkit/issues/3576, will be fixed with https://github.com/moby/moby/pull/44920.
Closing this issue since it has been fixed in BuildKit 0.11.2 (https://github.com/moby/buildkit/pull/3493)
@crazy-max has there been a new docker image release with the fix?
@denibertovic For this issue, it's already fixed with BuildKit 0.11.2, which is already released. For https://github.com/moby/moby/pull/44920 it will be in the next Moby patch release (23.0.1).
@crazy-max I understand. I'm using the official Docker images from Docker Hub in my CI/CD pipeline, and from what I can tell 23.0.1 has not been released yet. The latest one that's pushed seems to be 23.0.0.
This issue completely breaks my ability to build and run any Docker devcontainer in VS Code on Ubuntu 22.04. I've been able to block it in the devcontainer.json using the args: {} section. It seems to listen to me when I add BUILDKIT_INLINE_CACHE=0.
This was a new install, the first in a while. This will probably seriously disrupt a lot of people who use devcontainers in Docker.
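(For reference, the relevant part of a devcontainer.json for this workaround would look roughly like the sketch below; the dockerfile path is an assumption.)
{
  "build": {
    "dockerfile": "Dockerfile",
    "args": {
      "BUILDKIT_INLINE_CACHE": "0"
    }
  }
}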
Damn, where were you when this issue first started and I was editing VS Code, LOL.
Is this still expected to be a problem? I'm encountering this using:
- Docker for Mac
- Buildkit
- Cache Mounts
- A C++ build using an Ubuntu 22 base
When I do the exact same build on a Linux VM, I don't hit this issue.
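(For context, the setup listed above corresponds roughly to a Dockerfile like the sketch below; the package list, paths, and use of ccache are illustrative, not the reporter's actual file.)
FROM ubuntu:22.04
RUN apt-get update && apt-get install -y g++ make ccache
ENV CCACHE_DIR=/ccache
COPY . /src
WORKDIR /src
# compiler cache kept in a BuildKit cache mount across builds
RUN --mount=type=cache,target=/ccache make CXX="ccache g++"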
@denibertovic For this issue, it's already fixed with BuildKit 0.11.2 which is already released. For moby/moby#44920 it will be on next Moby patch release (23.0.1).
I am still getting the error, even though BuildKit in my case is v0.13.2:
sudo docker buildx ls
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
default* docker
\_ default \_ default running v0.13.2 linux/amd64, linux/amd64/v2, linux/amd64/v3, linux/amd64/v4, linux/386