opentelemetry-collector icon indicating copy to clipboard operation
opentelemetry-collector copied to clipboard

[CI] Builder integration tests fail on release prep

Open mx-psi opened this issue 3 years ago • 8 comments
trafficstars

Describe the bug

On PRs that prepare the repo to release a new version, the integration tests for ocb fail, since the default configuration file references a tag that is not yet available on the repository.

Steps to reproduce

Follow the release instructions. For example, see #6075

What did you expect to see?

All tests pass.

What did you see instead?

Integration tests for the builder fail with:

Error: failed to update go.mod: exit status 1. Output: "go: downloading go.opentelemetry.io/collector v0.60.0\ngo.opentelemetry.io/collector/cmd/otelcorecol: reading go.opentelemetry.io/collector/go.mod at revision v0.60.0: unknown revision v0.60.0\n"

See full logs for more information.

Additional context

This happens since #5752 cc @jpeach

mx-psi avatar Sep 14 '22 11:09 mx-psi

This is a tricky one. The problem is that in the release process, the make prepare-release step creates the PR that runs the builder integration tests, but only if those pass does the release manager run make push-tag to actually create the tags that the default builder config depends on.

If we simply hack in a replaces directive in the default builder config (for the purposes of integration tests), that won't work because the builder does its collector build in a temporary directory. Generating otelcorecol avoids this problem by passing the flags to generate code in the repository path and only doing the codegen. If we did the same in the builder tests, it would probably defeat the purpose of those tests.

One thing that I found is that go mod tidy will fix up "@latest" versions in go.mod, so it would be possible to have a builder that always builds the latest (maybe only do this for integration CI). Quick patch looks like this:

diff --git cmd/builder/internal/builder/main.go cmd/builder/internal/builder/main.go
index 6bd5a531..0846680e 100644
--- cmd/builder/internal/builder/main.go
+++ cmd/builder/internal/builder/main.go
@@ -20,6 +20,7 @@ import (
 	"os"
 	"os/exec"
 	"path/filepath"
+	"regexp"
 	"text/template"
 	"time"
 
@@ -45,12 +46,19 @@ func GenerateAndCompile(cfg Config) error {
 	return Compile(cfg)
 }
 
+var matchVersionString = regexp.MustCompile("[0-9]+([.][0-9]+)")
+
 // Generate assembles a new distribution based on the given configuration
 func Generate(cfg Config) error {
 	// create a warning message for non-aligned builder and collector base
 	if cfg.Distribution.OtelColVersion != defaultOtelColVersion {
 		cfg.Logger.Info("You're building a distribution with non-aligned version of the builder. Compilation may fail due to API changes. Please upgrade your builder or API", zap.String("builder-version", defaultOtelColVersion))
 	}
+
+	if ok := matchVersionString.MatchString(cfg.Distribution.OtelColVersion); ok {
+		cfg.Distribution.OtelColVersion = "v" + cfg.Distribution.OtelColVersion
+	}
+
 	// if the file does not exist, try to create it
 	if _, err := os.Stat(cfg.Distribution.OutputPath); os.IsNotExist(err) {
 		if err = os.Mkdir(cfg.Distribution.OutputPath, 0750); err != nil {
diff --git cmd/builder/internal/builder/templates/go.mod.tmpl cmd/builder/internal/builder/templates/go.mod.tmpl
index cf7da635..b232adf1 100644
--- cmd/builder/internal/builder/templates/go.mod.tmpl
+++ cmd/builder/internal/builder/templates/go.mod.tmpl
@@ -17,7 +17,7 @@ require (
 	{{- range .Processors}}
 	{{if .GoMod}}{{.GoMod}}{{end}}
 	{{- end}}
-	go.opentelemetry.io/collector v{{.Distribution.OtelColVersion}}
+	go.opentelemetry.io/collector {{.Distribution.OtelColVersion}}
 )
 
 {{- range .Extensions}}
diff --git cmd/builder/internal/config/default.yaml cmd/builder/internal/config/default.yaml
index bceb31e9..1e6df7e6 100644
--- cmd/builder/internal/config/default.yaml
+++ cmd/builder/internal/config/default.yaml
@@ -2,27 +2,26 @@ dist:
   module: go.opentelemetry.io/collector/cmd/otelcorecol
   name: otelcorecol
   description: Local OpenTelemetry Collector binary, testing only.
-  version: 0.60.0-dev
-  otelcol_version: 0.60.0
+  version: latest
+  otelcol_version: latest
 
 receivers:
   - import: go.opentelemetry.io/collector/receiver/otlpreceiver
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
 exporters:
   - import: go.opentelemetry.io/collector/exporter/loggingexporter
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
   - import: go.opentelemetry.io/collector/exporter/otlpexporter
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
   - import: go.opentelemetry.io/collector/exporter/otlphttpexporter
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
 extensions:
   - import: go.opentelemetry.io/collector/extension/ballastextension
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
   - import: go.opentelemetry.io/collector/extension/zpagesextension
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
 processors:
   - import: go.opentelemetry.io/collector/processor/batchprocessor
-    gomod: go.opentelemetry.io/collector v0.60.0
+    gomod: go.opentelemetry.io/collector latest
   - import: go.opentelemetry.io/collector/processor/memorylimiterprocessor
-    gomod: go.opentelemetry.io/collector v0.60.0
-
+    gomod: go.opentelemetry.io/collector latest

jpeach avatar Sep 15 '22 00:09 jpeach

How did this work for 0.59.0?

jpkrohling avatar Sep 15 '22 13:09 jpkrohling

How did this work for 0.59.0?

#5752 was merged just two days ago, it was not on main when releasing 0.59.0

mx-psi avatar Sep 15 '22 13:09 mx-psi

But we did have integration tests for the builder back then, and the problem would be similar. We just didn't have a default config file.

jpkrohling avatar Sep 15 '22 13:09 jpkrohling

The other configs are using replace or older versions

bogdandrutu avatar Sep 15 '22 14:09 bogdandrutu

I have to take another look, but I believe the integration tests should only use what we had before, while the new default config should be similar in test surface to what we had before.

jpkrohling avatar Sep 15 '22 14:09 jpkrohling

I believe the integration tests should only use what we had before, while the new default config should be similar in test surface to what we had before.

As part of https://github.com/open-telemetry/opentelemetry-collector/pull/5752, I added a new integration test case to verify the default config. Since that PR was always rebased onto the latest main, the release tags were always present upstream and the test always passed. I don't know of a reasonable way to not run this test in the special case of the release process, but that could be an option if people prefer to do that. We could also simply drop that test case.

I do sort of like the idea of the builder being able to always generate a collector using whatever the latest tag is, though the implementation is a bit hacky, and maybe it would eventually run into compatibility issues.

jpeach avatar Sep 15 '22 23:09 jpeach

We could also simply drop that test case.

I would replace that with regular unit tests, ensuring a config file can be generated. Whether a properly generated/crafted config file can be used to boot a collector is covered with the existing integration tests, which used to work during the releases (I believe).

I do sort of like the idea of the builder being able to always generate a collector using whatever the latest tag is

The builder should generate a collector with the same version that was available when the builder was released (which is the same version number right now). The reason is that things might move on the collector side from a version to another, so we can only ensure the builder works for known versions of the collector.

jpkrohling avatar Sep 16 '22 13:09 jpkrohling