buildx bake file precedence bug
Contributing guidelines
- [X] I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [X] ... the documentation does not mention anything about my problem
- [X] ... there are no open or closed issues that are related to my problem
Description
docker buildx bake reads and validates a compose.yaml file even when the build doesn't use that file.
Expected behaviour
Repro steps:
- Create a docker-bake.hcl file like this:
group "default" {
  targets = ["webapp"]
}
target "webapp" {
  dockerfile = "Dockerfile"
  tags = ["docker.io/username/webapp"]
}
- Create an invalid compose.yaml file like this:
services:
- run
- Run docker buildx bake
Expected behavior: I would expect buildx to successfully run the bakefile
Actual behaviour
docker buildx bake
[+] Building 0.0s (0/0) docker:desktop-linux
ERROR: validating : services must be a mapping
Buildx version
latest docker-internal
Docker info
Client: Docker Engine - Community
Version: 24.0.0
Context: desktop-linux
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.0.1-cloud-driver+004
Path: /home/nick/.docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.17.3
Path: /usr/lib/docker/cli-plugins/docker-compose
dev: Docker Dev Environments (Docker Inc.)
Version: v0.1.0
Path: /usr/lib/docker/cli-plugins/docker-dev
extension: Manages Docker extensions (Docker Inc.)
Version: v0.2.19
Path: /usr/lib/docker/cli-plugins/docker-extension
init: Creates Docker-related starter files for your project (Docker Inc.)
Version: v0.1.0-beta.4
Path: /usr/lib/docker/cli-plugins/docker-init
sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
Version: 0.6.0
Path: /usr/lib/docker/cli-plugins/docker-sbom
scan: Docker Scan (Docker Inc.)
Version: v0.26.0
Path: /usr/lib/docker/cli-plugins/docker-scan
scout: Command line tool for Docker Scout (Docker Inc.)
Version: v0.11.0
Path: /usr/lib/docker/cli-plugins/docker-scout
WARNING: Plugin "/usr/lib/docker/cli-plugins/docker-compose.14.backup" is not valid: plugin candidate "compose.14.backup" did not match "^[a-z][a-z0-9]*$"
Server:
Containers: 10
Running: 9
Paused: 0
Stopped: 1
Images: 102
Server Version: 23.0.5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 3dce8eb055cbb6872793272b4f20ed16117344f8
runc version: v1.1.7-0-g860f061
init version: de40ad0
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.15.49-linuxkit
Operating System: Docker Desktop
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 7.526GiB
Name: docker-desktop
ID: a92cef06-564f-4766-91bd-bc9e839af9fa
Docker Root Dir: /var/lib/docker
Debug Mode: false
HTTP Proxy: http.docker.internal:3128
HTTPS Proxy: http.docker.internal:3128
No Proxy: hubproxy.docker.internal
Experimental: false
Insecure Registries:
hubproxy.docker.internal:5555
127.0.0.0/8
Live Restore Enabled: false
Builders list
n/a
Configuration
see above
Build logs
No response
Additional info
If you fix the compose file, bake doesn't even use it - it builds what's in the bakefile
Side note -- this bug report is boiled down from a larger repro case.
In the larger repro case, there was a bug in the Compose file parser, and the fix hadn't been upstreamed into buildx yet. As a result, buildx gave a cryptic error on a compose file that it wasn't even using and that the latest version of Compose accepted.
The configuration file lookup order does seem inconsistent with the documentation. In my case, with a valid docker-compose.yml and a valid docker-bake.hcl in place, running docker buildx bake seems to give the compose file precedence over the HCL one.
Link to the relevant part in the documentation.
My docker version is 23.0.5 and buildx version is v0.10.4.
I'm going to try to confirm this by looking in a few more places, but it does appear that the precedence ordering is the exact opposite of what it should be. It looks like someone may have intended the last element in the slice to have the highest precedence, since the list is ordered that way, but the code paths I can find iterate through the list in the forward direction.
This may end up meaning that the bake command reads from the .json file before it reads from .hcl.
I think I understand more about why the ordering is backwards. The reason is less an error and more an artifact of how the override system works. When bake is reading in the configuration, it is creating groups and targets that will later be used by the actual command.
The overrides will create or override existing targets so they get read last. In essence, it merges all available targets from all files rather than only loading the first file it finds.
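To illustrate the merge behavior described above, here is a rough Python sketch. It is not buildx's actual code: the `merge_files` helper and the dict-based representation of parsed targets are assumptions made up for this example. It only shows the idea that every file contributes targets and that files read later override files read earlier.

```python
from typing import Dict, List

# Hypothetical parsed form: each file maps target names to attribute dicts.
Targets = Dict[str, Dict[str, object]]


def merge_files(parsed_files: List[Targets]) -> Targets:
    """Merge targets from every file in order; later files override earlier ones."""
    merged: Targets = {}
    for targets in parsed_files:
        for name, attrs in targets.items():
            # Create the target if it is new, otherwise override attribute by attribute.
            merged.setdefault(name, {}).update(attrs)
    return merged


# Illustrative data only: a compose-derived target and a bakefile target with the same name.
compose_targets = {"webapp": {"dockerfile": "Dockerfile", "tags": ["webapp:compose"]}}
bake_targets = {"webapp": {"tags": ["docker.io/username/webapp"]}}

# The compose file is read first, so the bakefile's tags win.
result = merge_files([compose_targets, bake_targets])
```

Under this model the compose file still has to be parsed successfully before anything is merged, which would explain why an invalid compose.yaml fails the build even though the bakefile's values end up being the ones used.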
I think there are two options for fixing this:
- Option 1: Clarify the behavior in the docs regarding the merging functionality, and either:
  - print a warning when a compose file cannot be read due to formatting,
  - or continue to fail, but with a clearer error message and with the documentation correctly describing why it's happening.
- Option 2: Modify bake so it only reads from the first file that it finds, while also keeping the merge functionality for the override files.
I'd probably go with 2 but I'd like to hear opinions. It seems like compose uses the second option although it gives some flexibility with the override names.
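For comparison, the difference between the two approaches to file selection could be sketched like this in Python. The file lists, their order, and both function names are hypothetical and chosen only for illustration; they are not taken from buildx or from the bake documentation.

```python
from typing import Iterable, List

# Hypothetical file lists, for illustration only (not the documented lookup order).
BASE_FILES = ["docker-bake.hcl", "docker-bake.json", "compose.yaml"]
OVERRIDE_FILES = ["docker-bake.override.json", "docker-bake.override.hcl"]


def select_merge_all(present: Iterable[str]) -> List[str]:
    """Merging behavior as described in this thread: load every known file that exists."""
    found = set(present)
    return [f for f in BASE_FILES + OVERRIDE_FILES if f in found]


def select_first_match(present: Iterable[str]) -> List[str]:
    """Option 2: load only the first base file found, plus any override files."""
    found = set(present)
    base = [f for f in BASE_FILES if f in found][:1]
    overrides = [f for f in OVERRIDE_FILES if f in found]
    return base + overrides


present = ["compose.yaml", "docker-bake.hcl", "docker-bake.override.hcl"]
all_files = select_merge_all(present)
first_only = select_first_match(present)
```

With option 2, a broken compose.yaml sitting next to a docker-bake.hcl would never be opened in this sketch, because the first base file found already satisfies the lookup.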
I caught the same problem. :(
This behavior is intentional and has been clarified in the docs: https://github.com/docker/buildx/commit/f4f511201bbfbcc029466201a14a34fa091f2216
At the same time, bake doesn't really need to load the entire compose file or validate the entire schema. We mostly just need the parts related to building. I'm going to open a new enhancement request for that and close this one since it's now documented behavior and not a bug.
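As a rough idea of what loading only the build-related parts could look like, here is a small Python sketch. It assumes the compose file has already been parsed into a plain dict, and the `extract_build_sections` helper is hypothetical; it does not reflect how buildx or the Compose loader is implemented.

```python
from typing import Any, Dict


def extract_build_sections(compose: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the `build` entries per service, ignoring every other key."""
    services = compose.get("services")
    if not isinstance(services, dict):
        # The top-level shape still has to make sense to find any build sections.
        raise ValueError("services must be a mapping")
    builds: Dict[str, Any] = {}
    for name, service in services.items():
        # Services without a build section, or with unrelated problems, are skipped.
        if isinstance(service, dict) and "build" in service:
            builds[name] = service["build"]
    return builds


# Illustrative input: `ports` has a bogus value, but it is never inspected.
parsed = {
    "services": {
        "webapp": {"build": {"context": "."}, "ports": 12345},
        "db": {"image": "postgres"},
    }
}
build_only = extract_build_sections(parsed)
```

In this sketch, a schema problem outside the `build` keys would not block the build, while a structurally broken file such as the repro above (services given as a list) would still be rejected.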