Collector cannot export metrics telemetry in an ipv6-only environment

Open lpetrazickisupgrade opened this issue 1 year ago • 3 comments

Describe the bug

  1. Expose the OpenTelemetry Collector's telemetry metrics on a correctly escaped IPv6 address
  2. The Collector unescapes the IP address and naively concatenates it with the port number
  3. Startup fails with a "too many colons" error
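
For illustration, here is a minimal Go sketch of the failure mode described above. It assumes, without this having been confirmed in the Collector source, that the address is split into host and port and then rebuilt by plain string concatenation. `naiveJoin` and `splitThenJoin` are hypothetical helpers written for this example, not Collector code. The standard library's `net.SplitHostPort` strips the square brackets, and `net.JoinHostPort` adds them back when the host contains a colon.

```go
package main

import (
	"fmt"
	"net"
)

// naiveJoin is a hypothetical stand-in for the suspected bug: it glues host
// and port together with a colon and ignores IPv6 bracket rules.
func naiveJoin(host, port string) string {
	return host + ":" + port
}

// splitThenJoin splits addr into host and port, rebuilds it with join, and
// verifies that the rebuilt string still parses as host:port.
func splitThenJoin(addr string, join func(host, port string) string) (string, error) {
	host, port, err := net.SplitHostPort(addr) // strips the square brackets
	if err != nil {
		return "", err
	}
	rebuilt := join(host, port)
	if _, _, err := net.SplitHostPort(rebuilt); err != nil {
		return rebuilt, err
	}
	return rebuilt, nil
}

func main() {
	// Address from this report, with the pod IP taken from the error message.
	const addr = "[dead:beef:dead:beef:dead::beef]:8888"

	rebuilt, err := splitThenJoin(addr, naiveJoin)
	fmt.Printf("naive: %q err=%v\n", rebuilt, err)

	rebuilt, err = splitThenJoin(addr, net.JoinHostPort)
	fmt.Printf("safe:  %q err=%v\n", rebuilt, err)
}
```

If the assumption holds, the naive path loses the brackets and the rebuilt address no longer parses, while the `net.JoinHostPort` path round-trips the original string.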

Steps to reproduce

  1. Delimit the ipv6 address with square brackets:
service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'
  2. Deploy config to an ipv6-only environment
  3. The Collector fails at startup with: listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address
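
The error text can also be reproduced outside the Collector. The sketch below passes the unbracketed address from the log straight to Go's `net.Listen`; `tryListen` is a hypothetical helper for this example. The address is rejected during parsing, so no socket is opened and no IPv6 network is needed to run it.

```go
package main

import (
	"fmt"
	"net"
)

// tryListen attempts to open a TCP listener on addr. It returns the error
// text on failure, or an empty string if the listener could be opened.
func tryListen(addr string) string {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return err.Error()
	}
	defer ln.Close()
	return ""
}

func main() {
	// Unbracketed IPv6 address plus port, exactly as it appears in the log.
	fmt.Println(tryListen("dead:beef:dead:beef:dead::beef:8888"))
}
```

This should print the same "too many colons in address" message as the Collector log, which suggests the brackets are already gone by the time the address reaches the listener.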

What did you expect to see? Metrics on port 8888

What did you see instead?

{
  "level": "error",
  "ts": 1713554862.2179377,
  "caller": "otelcol@v0.98.0/collector.go:275",
  "msg": "Asynchronous error received, terminating process",
  "error": "listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address",
  "stacktrace": "
go.opentelemetry.io/collector/otelcol.(*Collector).Run
    go.opentelemetry.io/collector/otelcol@v0.98.0/collector.go:275
go.opentelemetry.io/collector/otelcol.NewCommand.func1
    go.opentelemetry.io/collector/otelcol@v0.98.0/command.go:35
github.com/spf13/cobra.(*Command).execute
    github.com/spf13/[email protected]/command.go:983
github.com/spf13/cobra.(*Command).ExecuteC
    github.com/spf13/[email protected]/command.go:1115
github.com/spf13/cobra.(*Command).Execute
    github.com/spf13/[email protected]/command.go:1039
main.runInteractive
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:27
main.run
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:10
main.main
    github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:20
runtime.main
    runtime/proc.go:271"
}

What version did you use? v0.98.0

What config did you use?

service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'

Environment

helm.sh/chart: opentelemetry-collector-0.87.2
Image: opentelemetry-collector-contrib:0.98.0
Kubernetes: v1.29.1-eks-b9c9ed7

Additional context

This is a regression. v0.79.0 did not have this issue.

lpetrazickisupgrade avatar Apr 22 '24 13:04 lpetrazickisupgrade

@lpetrazickisupgrade I am curious whether the issue is with the collector serving the metrics or with the prometheus receiver scraping them. Can you reproduce the issue without a prometheus receiver trying to scrape?

TylerHelmuth avatar Apr 22 '24 15:04 TylerHelmuth

Most likely though this is a bug from switching to using the OTel Go SDK instead of opencensus.

/cc @codeboten

TylerHelmuth avatar Apr 22 '24 15:04 TylerHelmuth

@TylerHelmuth Thanks for taking a look! I think the OpenTelemetry Collector process is crashing at startup while parsing the config. The pod is in a CrashLoopBackOff. It doesn't get far enough in the startup sequence to respond to network requests. I've included the only log message.

I think the regression may have been introduced by this PR: https://github.com/open-telemetry/opentelemetry-collector/pull/9632/files

My reasoning is that the otlp exporter reuses the grpc client config: https://github.com/open-telemetry/opentelemetry-collector/blame/v0.98.0/exporter/otlpexporter/config.go#L25

lpetrazickisupgrade avatar Apr 22 '24 16:04 lpetrazickisupgrade