opentelemetry-collector
opentelemetry-collector copied to clipboard
[CI] Builder integration tests fail on release prep
Describe the bug
On PRs that prepare the repo to release a new version, the integration tests for ocb fail, since the default configuration file references a tag that is not yet available on the repository.
Steps to reproduce
Follow the release instructions. For example, see #6075
What did you expect to see?
All tests pass.
What did you see instead?
Integration tests for the builder fail with:
Error: failed to update go.mod: exit status 1. Output: "go: downloading go.opentelemetry.io/collector v0.60.0\ngo.opentelemetry.io/collector/cmd/otelcorecol: reading go.opentelemetry.io/collector/go.mod at revision v0.60.0: unknown revision v0.60.0\n"
See full logs for more information.
Additional context
This happens since #5752 cc @jpeach
This is a tricky one. The problem is that in the release process, the make prepare-release step creates the PR that runs the builder integration tests, but only if those pass does the release manager run make push-tag to actually create the tags that the default builder config depends on.
If we simply hack in a replaces directive in the default builder config (for the purposes of integration tests), that won't work because the builder does its collector build in a temporary directory. Generating otelcorecol avoids this problem by passing the flags to generate code in the repository path and only doing the codegen. If we did the same in the builder tests, it would probably defeat the purpose of those tests.
One thing that I found is that go mod tidy will fix up "@latest" versions in go.mod, so it would be possible to have a builder that always builds the latest (maybe only do this for integration CI). Quick patch looks like this:
diff --git cmd/builder/internal/builder/main.go cmd/builder/internal/builder/main.go
index 6bd5a531..0846680e 100644
--- cmd/builder/internal/builder/main.go
+++ cmd/builder/internal/builder/main.go
@@ -20,6 +20,7 @@ import (
"os"
"os/exec"
"path/filepath"
+ "regexp"
"text/template"
"time"
@@ -45,12 +46,19 @@ func GenerateAndCompile(cfg Config) error {
return Compile(cfg)
}
+var matchVersionString = regexp.MustCompile("[0-9]+([.][0-9]+)")
+
// Generate assembles a new distribution based on the given configuration
func Generate(cfg Config) error {
// create a warning message for non-aligned builder and collector base
if cfg.Distribution.OtelColVersion != defaultOtelColVersion {
cfg.Logger.Info("You're building a distribution with non-aligned version of the builder. Compilation may fail due to API changes. Please upgrade your builder or API", zap.String("builder-version", defaultOtelColVersion))
}
+
+ if ok := matchVersionString.MatchString(cfg.Distribution.OtelColVersion); ok {
+ cfg.Distribution.OtelColVersion = "v" + cfg.Distribution.OtelColVersion
+ }
+
// if the file does not exist, try to create it
if _, err := os.Stat(cfg.Distribution.OutputPath); os.IsNotExist(err) {
if err = os.Mkdir(cfg.Distribution.OutputPath, 0750); err != nil {
diff --git cmd/builder/internal/builder/templates/go.mod.tmpl cmd/builder/internal/builder/templates/go.mod.tmpl
index cf7da635..b232adf1 100644
--- cmd/builder/internal/builder/templates/go.mod.tmpl
+++ cmd/builder/internal/builder/templates/go.mod.tmpl
@@ -17,7 +17,7 @@ require (
{{- range .Processors}}
{{if .GoMod}}{{.GoMod}}{{end}}
{{- end}}
- go.opentelemetry.io/collector v{{.Distribution.OtelColVersion}}
+ go.opentelemetry.io/collector {{.Distribution.OtelColVersion}}
)
{{- range .Extensions}}
diff --git cmd/builder/internal/config/default.yaml cmd/builder/internal/config/default.yaml
index bceb31e9..1e6df7e6 100644
--- cmd/builder/internal/config/default.yaml
+++ cmd/builder/internal/config/default.yaml
@@ -2,27 +2,26 @@ dist:
module: go.opentelemetry.io/collector/cmd/otelcorecol
name: otelcorecol
description: Local OpenTelemetry Collector binary, testing only.
- version: 0.60.0-dev
- otelcol_version: 0.60.0
+ version: latest
+ otelcol_version: latest
receivers:
- import: go.opentelemetry.io/collector/receiver/otlpreceiver
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
exporters:
- import: go.opentelemetry.io/collector/exporter/loggingexporter
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
- import: go.opentelemetry.io/collector/exporter/otlpexporter
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
- import: go.opentelemetry.io/collector/exporter/otlphttpexporter
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
extensions:
- import: go.opentelemetry.io/collector/extension/ballastextension
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
- import: go.opentelemetry.io/collector/extension/zpagesextension
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
processors:
- import: go.opentelemetry.io/collector/processor/batchprocessor
- gomod: go.opentelemetry.io/collector v0.60.0
+ gomod: go.opentelemetry.io/collector latest
- import: go.opentelemetry.io/collector/processor/memorylimiterprocessor
- gomod: go.opentelemetry.io/collector v0.60.0
-
+ gomod: go.opentelemetry.io/collector latest
How did this work for 0.59.0?
How did this work for 0.59.0?
#5752 was merged just two days ago, it was not on main when releasing 0.59.0
But we did have integration tests for the builder back then, and the problem would be similar. We just didn't have a default config file.
The other configs are using replace or older versions
I have to take another look, but I believe the integration tests should only use what we had before, while the new default config should be similar in test surface to what we had before.
I believe the integration tests should only use what we had before, while the new default config should be similar in test surface to what we had before.
As part of https://github.com/open-telemetry/opentelemetry-collector/pull/5752, I added a new integration test case to verify the default config. Since that PR was always rebased onto the latest main, the release tags were always present upstream and the test always passed. I don't know of a reasonable way to not run this test in the special case of the release process, but that could be an option if people prefer to do that. We could also simply drop that test case.
I do sort of like the idea of the builder being able to always generate a collector using whatever the latest tag is, though the implementation is a bit hacky, and maybe it would eventually run into compatibility issues.
We could also simply drop that test case.
I would replace that with regular unit tests, ensuring a config file can be generated. Whether a properly generated/crafted config file can be used to boot a collector is covered with the existing integration tests, which used to work during the releases (I believe).
I do sort of like the idea of the builder being able to always generate a collector using whatever the latest tag is
The builder should generate a collector with the same version that was available when the builder was released (which is the same version number right now). The reason is that things might move on the collector side from a version to another, so we can only ensure the builder works for known versions of the collector.