chore(.github/workflows): cache datadog-agent image (POC for general Docker images caching)
What does this PR do?
Caches Docker images, starting with datadog-agent.
It also improves the DRYness of our workflows by using CUE.
Motivation
We run 12 jobs, one per combination of contrib set and supported Go version, and each job pulls up to 18 images, over and over. This causes:
- Slow CI time: service initialization in each job averages around 2 minutes and 30 seconds.
- Rate limiting: the pull request tests of a single PR alone hit Docker Hub up to 216 times (12 jobs × up to 18 images each). Any increase in the number of PRs running CI will trigger rate limiting and fail our pipelines.
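The numbers above can be checked with a quick sketch. The Go snippet below is illustrative only: the function names are hypothetical, and the cached figures assume that each distinct image is pulled at most once per CI run and then shared by every job, which this description does not spell out.

```go
package main

import "fmt"

// uncachedPulls returns the number of registry pulls for one PR's CI run
// when every job pulls every image itself.
func uncachedPulls(jobs, imagesPerJob int) int {
	return jobs * imagesPerJob
}

// cachedPulls returns the number of registry pulls when images are cached.
// Assumption: with a cold cache each distinct image is pulled exactly once;
// with a warm cache nothing is pulled from the registry.
func cachedPulls(distinctImages int, warm bool) int {
	if warm {
		return 0
	}
	return distinctImages
}

func main() {
	fmt.Println(uncachedPulls(12, 18))  // 216, the figure quoted above
	fmt.Println(cachedPulls(18, false)) // 18 under the cold-cache assumption
	fmt.Println(cachedPulls(18, true))  // 0 under the warm-cache assumption
}
```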
Reviewer's Checklist
- [ ] Changed code has unit tests for its functionality at or near 100% coverage.
- [ ] System-Tests covering this feature have been added and enabled with the va.b.c-dev version tag.
- [ ] There is a benchmark for any new code, or changes to existing code.
- [ ] If this interacts with the agent in a new way, a system test has been added.
- [ ] New code is free of linting errors. You can check this by running ./scripts/lint.sh locally.
- [ ] Add an appropriate team label so this PR gets put in the right place for the release notes.
- [ ] Non-trivial go.mod changes, e.g. adding new modules, are reviewed by @DataDog/dd-trace-go-guild.
Unsure? Have a question? Request a review!
⚠️ Warnings
🧪 1 Test failed
TestTracesAgentIntegration from github.com/DataDog/dd-trace-go/v2/ddtrace/tracer (Datadog): Failed
=== RUN TestTracesAgentIntegration
transport_test.go:92:
Error Trace: /home/runner/work/dd-trace-go/dd-trace-go/ddtrace/tracer/transport_test.go:92
Error: Received unexpected error: Post "http://localhost:8126/v0.4/traces": dial tcp [::1]:8126: connect: connection refused
Test: TestTracesAgentIntegration
--- FAIL: TestTracesAgentIntegration (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered] ...
ℹ️ Info
❄️ No new flaky tests detected
This comment will be updated automatically if new data arrives. Commit SHA: 93f203b
Benchmarks
Benchmark execution time: 2025-08-26 11:43:00
Comparing candidate commit 93f203b0ccfafcc371ddb10e73766dc0fe14517b in PR branch dario.castane/ktlo/download-agent-once-run-multiple-times with baseline commit 0441ec41104901fcf192d1c13ca9df5c1636721f in branch dario.castane/ktlo/disable-main-branch-ci.
Found 0 performance improvements and 0 performance regressions! Performance is the same for 24 metrics, 0 unstable metrics.
It is definitely better for the services workflow. But for the pull-request and unit-integration workflows, I'm not sure.
I agree, but I didn't want to take on a full refactor yet. My focus was on services and on avoiding duplicating versions across workflows.
I definitely see benefits in using CUE, although it's a bit complex. See what I had to do to achieve the conversion from #Service to #Image to reuse the service definition. Once it's set up, it just works.
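To make the idea of reusing the service definition concrete, here is a rough sketch in Go rather than CUE. It is not the PR's actual code: the Service and Image types, their fields, and the imagesFrom helper are hypothetical stand-ins for #Service and #Image, and the image names and tags are placeholders. The point is only that the list of images to cache is derived from the services, so each version is declared once.

```go
package main

import (
	"fmt"
	"sort"
)

// Service is a hypothetical stand-in for the #Service definition.
type Service struct {
	Name       string
	Repository string            // image repository (placeholder values below)
	Tag        string            // image version, declared only here
	Env        map[string]string // service-only settings, irrelevant to caching
}

// Image is a hypothetical stand-in for #Image: only what caching needs.
type Image struct {
	Repository string
	Tag        string
}

// Ref returns the image reference in repository:tag form.
func (i Image) Ref() string { return i.Repository + ":" + i.Tag }

// imagesFrom derives the deduplicated image list from the services,
// sorted by reference so the result is stable across runs.
func imagesFrom(services []Service) []Image {
	seen := make(map[string]bool)
	var images []Image
	for _, s := range services {
		img := Image{Repository: s.Repository, Tag: s.Tag}
		if seen[img.Ref()] {
			continue // several services may share one image
		}
		seen[img.Ref()] = true
		images = append(images, img)
	}
	sort.Slice(images, func(a, b int) bool {
		return images[a].Ref() < images[b].Ref()
	})
	return images
}

func main() {
	// Placeholder services; names and versions are made up for illustration.
	services := []Service{
		{Name: "agent", Repository: "example/agent", Tag: "1.0.0"},
		{Name: "agent-replica", Repository: "example/agent", Tag: "1.0.0"},
		{Name: "db", Repository: "example/db", Tag: "2.0.0"},
	}
	for _, img := range imagesFrom(services) {
		fmt.Println(img.Ref())
	}
}
```

With this shape, bumping a version means editing one Service entry, and both the service definition and the cache list pick it up.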