opentelemetry-collector
Collector cannot export metrics telemetry in an ipv6-only environment
Describe the bug
- Configure the OpenTelemetry Collector to expose its telemetry metrics on a correctly bracketed IPv6 address
- The Collector strips the brackets from the IP address and naively concatenates it with the port number
- Startup fails with a "too many colons in address" error
Steps to reproduce
- Delimit the ipv6 address with square brackets:
service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'
- Deploy config to an ipv6-only environment
listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address
What did you expect to see? Metrics on port 8888
What did you see instead?
{
"level": "error",
"ts": 1713554862.2179377,
"caller": "[email protected]/collector.go:275",
"msg": "Asynchronous error received, terminating process",
"error": "listen tcp: address dead:beef:dead:beef:dead::beef:8888: too many colons in address",
"stacktrace": "
go.opentelemetry.io/collector/otelcol.(*Collector).Run
go.opentelemetry.io/collector/[email protected]/collector.go:275
go.opentelemetry.io/collector/otelcol.NewCommand.func1
go.opentelemetry.io/collector/[email protected]/command.go:35
github.com/spf13/cobra.(*Command).execute
github.com/spf13/[email protected]/command.go:983
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/[email protected]/command.go:1115
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/[email protected]/command.go:1039
main.runInteractive
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:27
main.run
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main_others.go:10
main.main
github.com/open-telemetry/opentelemetry-collector-releases/contrib/main.go:20
runtime.main
runtime/proc.go:271"
}
What version did you use? v0.98.0
What config did you use?
service:
  telemetry:
    logs:
      encoding: json
    metrics:
      address: '[${env:MY_POD_IP}]:8888'
Environment
helm.sh/chart: opentelemetry-collector-0.87.2
Image: opentelemetry-collector-contrib:0.98.0
Kubernetes: v1.29.1-eks-b9c9ed7
Additional context This is a regression: v0.79.0 did not have this issue.
@lpetrazickisupgrade I am curious whether the issue is with the collector serving the metrics or with the Prometheus receiver scraping them. Can you reproduce the issue without a Prometheus receiver trying to scrape?
Most likely though this is a bug from switching to using the OTel Go SDK instead of opencensus.
/cc @codeboten
@TylerHelmuth Thanks for taking a look! I think the OpenTelemetry Collector process is crashing at startup while parsing the config. The pod is in a CrashLoopBackOff, so it doesn't get far enough in the startup sequence to respond to network requests. The log message I've included is the only one it emits.
I think the regression may have been introduced by this PR: https://github.com/open-telemetry/opentelemetry-collector/pull/9632/files
I suspect it because the otlp exporter reuses the gRPC client config: https://github.com/open-telemetry/opentelemetry-collector/blame/v0.98.0/exporter/otlpexporter/config.go#L25