OTEL variables intended for bake build delays build
Contributing guidelines
- [x] I've read the contributing guidelines and wholeheartedly agree
I've found a bug and checked that ...
- [x] ... the documentation does not mention anything about my problem
- [ ] ... there are no open or closed issues that are related to my problem
Description
Overriding OTEL-related bake variables (intended for building an image) can introduce artificially long build delays depending on the variable values.
(Note: though a real issue, this is not something I encountered in real usage; this is mainly a companion to compose issue https://github.com/docker/compose/issues/13157, which is something I encountered.)
Expected behaviour
For a bake variable used only in the building of an image, I would expect the value to influence build time to the extent that it impacts the build cache. Even more specifically, I would expect a fully-cached build to complete in sub-second time.
Actual behaviour
Depending on the variable name and value, a delay of ten seconds (or more) can occur even when the build is fully cached.
Buildx version
github.com/docker/buildx v0.26.1 1a8287f
Docker info
Client: Docker Engine - Community
Version: 28.3.3
Context: default
Debug Mode: false
Plugins:
ai: Docker AI Agent - Ask Gordon (Docker Inc.)
Version: v1.9.3
Path: /home/robertovillarreal/.docker/cli-plugins/docker-ai
buildx: Docker Buildx (Docker Inc.)
Version: v0.26.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.39.1
Path: /usr/libexec/docker/cli-plugins/docker-compose
model: Docker Model Runner (EXPERIMENTAL) (Docker Inc.)
Version: v0.1.36
Path: /usr/libexec/docker/cli-plugins/docker-model
scan: Docker Scan (Docker Inc.)
Version: v0.23.0
Path: /usr/libexec/docker/cli-plugins/docker-scan
Server:
Containers: 11
Running: 4
Paused: 0
Stopped: 7
Images: 97
Server Version: 28.3.3
Storage Driver: overlayfs
driver-type: io.containerd.snapshotter.v1
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
CDI spec directories:
/etc/cdi
/var/run/cdi
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc sysbox-runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 05044ec0a9a75232cad458027ca83437aae3f4da
runc version: v1.2.5-0-g59923ef
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.8.0-60-generic
Operating System: Ubuntu 24.04.3 LTS
OSType: linux
Architecture: x86_64
CPUs: 16
Total Memory: 30.56GiB
Name: l-9jylpn3
ID: B6K2:2BOW:BSIE:WIGE:RODV:GC2B:JMYF:6XP4:25AT:3S4Q:6634:3OII
Docker Root Dir: /var/lib/docker
Debug Mode: true
File Descriptors: 67
Goroutines: 127
System Time: 2025-08-25T20:52:31.9142915-06:00
EventsListeners: 1
Experimental: true
Insecure Registries:
<snip>
192.168.1.0/24
::1/128
127.0.0.0/8
Registry Mirrors:
http://localhost:5005/
http://localhost:5006/
http://localhost:5007/
Live Restore Enabled: false
Default Address Pools:
Base: 172.25.0.0/16, Size: 24
Builders list
NAME/NODE DRIVER/ENDPOINT STATUS BUILDKIT PLATFORMS
buildkit-dev docker-container
\_ buildkit-dev0 \_ unix:///var/run/docker.sock inactive
jd docker-container
\_ jd0 \_ unix:///var/run/docker.sock running v0.22.0 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (7 more)
<snip inactive but sensitive entries>
temp docker-container
\_ temp0 \_ unix:///var/run/docker.sock inactive
default* docker
\_ default \_ default running v0.23.2 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (6 more)
teamx docker
\_ teamx \_ teamx running v0.23.2 linux/amd64 (+4), linux/arm64, linux/arm (+2), linux/ppc64le, (6 more)
worker docker
\_ worker \_ worker running v0.23.2 linux/amd64 (+2), linux/arm64, linux/arm (+2), linux/ppc64le, (5 more)
Configuration
Bake file:
variable "OTEL_TRACES_EXPORTER" {
type = string
default = "none"
}
target "default" {
dockerfile-inline = <<-EOT
FROM busybox
ARG OTEL_TRACES_EXPORTER
RUN echo "using $OTEL_TRACES_EXPORTER"
EOT
args = {
OTEL_TRACES_EXPORTER = OTEL_TRACES_EXPORTER
}
}
Very fast, as expected:
$ date && docker buildx bake && date
Mon Aug 25 09:00:30 PM MDT 2025
[+] Building 0.2s (7/7) FINISHED docker:default
<snip>
Mon Aug 25 09:00:30 PM MDT 2025
Always takes ten seconds of wall time (note that the builder still reports 0.2 seconds, as above):
$ date; OTEL_TRACES_EXPORTER=otlp docker buildx bake default; date
Mon Aug 25 09:04:19 PM MDT 2025
[+] Building 0.2s (7/7) FINISHED docker:default
<snip>
Mon Aug 25 09:04:29 PM MDT 2025
To help illustrate the lag:
$ date; OTEL_TRACES_EXPORTER=otlp docker buildx bake default --progress rawjson; date
Mon Aug 25 09:06:09 PM MDT 2025
{"vertexes":[{"digest":"sha256:032bddc7348073368c320605544d844c00e2b5f7e6ed7271de7ecf8e6e49821d","name":"[internal] load local bake definitions","started":"2025-08-25T21:06:10.011840056-06:00"}]}
<snip... time between these two is .2 seconds>
{"vertexes":[{"digest":"sha256:cf54b426da55281043924583d6743b1f70151b6a0169f1b1d3ee7de26f96edee","name":"exporting to image","started":"2025-08-26T03:06:10.234018654Z","completed":"2025-08-26T03:06:10.283281877Z"}]}
<ten seconds between last printed vertex and bake execution>
Mon Aug 25 09:06:20 PM MDT 2025
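The gap between the reported build time and the wall time can be measured a little more directly than by bracketing the command with date. A minimal sketch, assuming a POSIX-style shell whose date supports +%s; wall_time is a hypothetical helper name for illustration and is not part of docker or buildx:

```shell
# wall_time: run a command and report the elapsed wall-clock seconds on stderr.
# Hypothetical helper for illustration; not part of docker or buildx.
wall_time() {
  wt_start=$(date +%s)
  "$@"
  wt_status=$?
  wt_end=$(date +%s)
  echo "wall time: $((wt_end - wt_start))s" >&2
  # Preserve the exit status of the wrapped command.
  return "$wt_status"
}

# Example (requires docker; compare with the "[+] Building 0.2s" line):
#   export OTEL_TRACES_EXPORTER=otlp
#   wall_time docker buildx bake default
```

The resolution is whole seconds, which is enough to tell a sub-second build from a ten second stall.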
Build logs
Additional info
My reproduction is not something I'd do in reality; it is very common for me to use OTEL environment variables in a Dockerfile, but they are always static values and not something I'd change at build time. But somebody else might. Though I discovered this "on accident" (https://github.com/docker/compose/issues/13157), this is a grey area. Obviously the BUILDX_*, DOCKER_*, etc. environment variables are more-or-less 'protected', but not OTEL_*. In my example, there doesn't appear to be a way for the user to say "I only want to influence my bake file" or "I want to influence buildx telemetry", or worse... "I want to influence both".
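Since the OTEL_* names are not reserved, it can help to check which of them are already exported before invoking bake. A small sketch, assuming (as this report suggests) that buildx reads these variables from its own environment; list_otel_env is a hypothetical helper name:

```shell
# list_otel_env: print the names of OTEL_* variables in the current environment.
# Hypothetical helper; anything it prints would be inherited by a
# "docker buildx bake" started from this shell.
list_otel_env() {
  env | sed -n 's/^\(OTEL_[A-Za-z0-9_]*\)=.*/\1/p'
}
```

Only the names are printed, so the output is safe to paste into a bug report even if a value such as a header contains credentials.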
I chose OTEL_TRACES_EXPORTER in my reproduction because in the absence of other OTEL variables, it consistently gives a ten second lag. But if my example was OTEL_EXPORTER_OTLP_ENDPOINT, it would be unlikely that one value would be 'correct' for both inclusion in the image as well as buildx telemetry. And there would be no way to provide each (buildx itself, and the image being created) with its own 'correct' value.
Though a fix for this would likely be low priority (on the bake side), I thought your ideas about potential solutions might help influence what the compose folks do. I noticed that #2447 seems somewhat related (esp. the solutions/strategies discussed).
Might be related to https://github.com/moby/buildkit/issues/4616 or at least similar? The symptoms sound the same but the reproduction sounds different. At the same time, they may be related. buildx might be trying to use the tracer and timing out at 10 seconds when it can't reach it. The actual containers may be using the environment variable completely fine because the buildkit instance has access.
Almost certainly related, but slightly different. That one mentions buildkit being intentionally configured with a bad value. In that scenario, a hang obviously isn't desirable, but it's understandable. In my case, buildx is consuming that value for itself (which is my actual bug), whereas my intention was for the value to be a build argument to a Dockerfile.
I had forgotten about it until just now, but I've experienced this same ten second delay (also involving OTEL_TRACES_EXPORTER) in another project, in that case involving the docker daemon itself: https://github.com/earthly/earthly/issues/4066. So maybe your twenty seconds is actually two delays: a ten second delay from the buildkit daemon (analogous to that bug report involving the docker daemon), and then another ten second delay from buildx itself if it inherited that value like you suggested. Just a thought.
So yeah, both are the same as far as bad values causing delays, though my report is that the value was not even intended for bake. (Compose had that same issue as well, but had a relatively straightforward fix since there's a separation between variables for compose itself vs. variables for downstream containers, but bake doesn't have that separation.)
Sorry for the long delay on a response to your comment above.
I think this is likely working as designed. Sending OTEL traces is a feature in buildx. While the desired result is to send the environment variable to the builds, it's still setting an environment variable.
The only way I can think of around this is to add the ability to set variables without using environment variables or to use a different name for the environment variable in your configuration when passing it to bake. Something like:
variable "BUILD_TRACES_EXPORTER" {
}
target "mytarget" {
args = {
OTEL_TRACES_EXPORTER = BUILD_TRACES_EXPORTER
}
}
Then use BUILD_TRACES_EXPORTER as the environment variable for passing that value to your build in bake. It might also be possible for us to add an environment variable specific to buildx to disable the OTEL SDK but that wouldn't be my preferred option.
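If the calling environment (a CI job, for example) already exports OTEL_TRACES_EXPORTER, the rename can be applied at invocation time. A minimal sketch, assuming the bake file declares BUILD_TRACES_EXPORTER as in the snippet above; with_renamed_otel is a hypothetical wrapper name, not a buildx feature:

```shell
# with_renamed_otel: run a command with OTEL_TRACES_EXPORTER moved to
# BUILD_TRACES_EXPORTER, so the wrapped command never sees the OTEL_ name.
# Hypothetical wrapper for illustration; not part of docker or buildx.
with_renamed_otel() {
  (
    # The subshell keeps the caller's environment untouched.
    if [ -n "${OTEL_TRACES_EXPORTER+set}" ]; then
      BUILD_TRACES_EXPORTER=$OTEL_TRACES_EXPORTER
      export BUILD_TRACES_EXPORTER
      unset OTEL_TRACES_EXPORTER
    fi
    "$@"
  )
}

# Example (requires docker):
#   with_renamed_otel docker buildx bake mytarget
# When nothing exports the OTEL_ name, the plain form is enough:
#   BUILD_TRACES_EXPORTER=otlp docker buildx bake mytarget
```

This also drops the variable from buildx's own telemetry configuration, so it only fits the case where the value was meant for the image alone.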
> The only way I can think of around this is to add the ability to set variables without using environment variable
Yeah. It wouldn't have to replace the usage of environment variables, just control precedence... e.g.
# closer to a typical CI scenario and your "working as designed"...
# buildkit consumes 'otlp' for itself, but 'console' is passed for the bake variable
$ OTEL_TRACES_EXPORTER=otlp docker buildx bake default --var OTEL_TRACES_EXPORTER=console
# my scenario, where the environment variable is not set at all and there is no OTEL setup, but set for illustration...
# buildkit consumes 'none', and the variable is set to 'otlp'
$ OTEL_TRACES_EXPORTER=none docker buildx bake default --var OTEL_TRACES_EXPORTER=otlp
# shorter equivalent
$ docker buildx bake default --var OTEL_TRACES_EXPORTER=otlp
> or to use a different name for the environment variable in your configuration when passing it to bake
Yup. Pretty straightforward and not terrible... provided you're already aware of the issue/behavior. I guess another work-around would be not declaring the variable at all, and instead doing something like
$ docker buildx bake default --set '*.args.OTEL_TRACES_EXPORTER=otlp'
(That would probably be my go-to solution assuming nothing changes.)
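For more than one build argument, the override string can be generated instead of typed out each time. A small sketch, assuming the `*.args.NAME=VALUE` pattern shown above; args_override is a hypothetical helper name:

```shell
# args_override: print a bake --set value that passes NAME=VALUE as a build
# argument to every target. Hypothetical helper; only the "*.args.NAME=VALUE"
# pattern itself comes from the command above.
args_override() {
  printf '*.args.%s=%s\n' "$1" "$2"
}

# Example (requires docker):
#   docker buildx bake default --set "$(args_override OTEL_TRACES_EXPORTER otlp)"
```

Because the value travels on the command line, nothing named OTEL_* is placed in the environment of the buildx process.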
> It might also be possible for us to add an environment variable specific to buildx to disable the OTEL SDK but that wouldn't be my preferred option.
Agreed.
Adding something like --vars would obviously solve this, but unless that was documented as the preferred method, my guess is that folks would only start using it once they encountered this issue (i.e., already too late). All things considered, the best practical 'solution' might just be a quick note somewhere in the docs.
I'm going to switch this to a feature request since it's technically working as designed, but we'll consider whether we're going to add another way to specify variables on the command line outside of environment variables.