Figure out how to support more OTel knobs

Open dprotaso opened this issue 3 months ago • 9 comments

TIL: there's an almost-stable config format that will work across OTel implementations:

Go implementation: https://pkg.go.dev/go.opentelemetry.io/contrib/otelconf/v0.3.0

Spec: https://github.com/open-telemetry/opentelemetry-specification/issues/4374

We could support additional OTel knobs by allowing operators to specify such a config in config-observability

Then we could merge the existing keys with the keys in said config.
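A minimal sketch of that merge step, assuming both sources have already been flattened to string key/value pairs and that the existing config-observability keys win on conflict. The precedence rule, the `mergeKeys` helper, and the key names are all hypothetical; nothing in this thread settles them.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeKeys returns a new map holding every key from base, overlaid with
// every key from override. On conflict the value from override wins.
// Neither input map is modified.
func mergeKeys(base, override map[string]string) map[string]string {
	out := make(map[string]string, len(base)+len(override))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range override {
		out[k] = v
	}
	return out
}

func main() {
	// Hypothetical values derived from an operator-supplied OTel config.
	fromOTelConfig := map[string]string{
		"metrics-endpoint": "http://from-otel-config:4318",
		"metrics-timeout":  "10s",
	}
	// Hypothetical keys already present in config-observability.
	existing := map[string]string{
		"metrics-endpoint": "http://collector.observability:4318",
	}

	// Assumption: the existing keys take precedence over the OTel config.
	merged := mergeKeys(fromOTelConfig, existing)

	// Print in sorted key order so the output is deterministic.
	keys := make([]string, 0, len(merged))
	for k := range merged {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, merged[k])
	}
}
```

Swapping the two arguments flips the precedence, so the open design question is only which source should be treated as the override.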

Alternatively, given that the config is pretty large (it was probably pulled out of the collector), we could simply add more options as separate config map keys as people request them.

dprotaso avatar Sep 11 '25 15:09 dprotaso

Was thinking about this for https://github.com/knative/pkg/issues/3256. But honestly I don't think this should gate that change

cc @evankanderson @Cali0707 for any input.

dprotaso avatar Sep 11 '25 15:09 dprotaso

Also all the knobs can be manipulated with env vars

https://opentelemetry.io/docs/specs/otel/configuration/sdk-environment-variables/

e.g. https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp#pkg-overview
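To illustrate what "configured by env vars" means for the endpoint knob, here is a simplified sketch of the precedence between `OTEL_EXPORTER_OTLP_METRICS_ENDPOINT` and `OTEL_EXPORTER_OTLP_ENDPOINT` for OTLP over HTTP. The exporters apply this kind of resolution themselves, so a component would not normally need this code; the `metricsEndpoint` helper is hypothetical and skips URL validation and the built-in default.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// metricsEndpoint resolves the OTLP/HTTP metrics URL using a lookup function
// shaped like os.LookupEnv. The signal-specific variable is used as-is; the
// generic variable is treated as a base URL and gets "v1/metrics" appended.
func metricsEndpoint(lookup func(string) (string, bool)) (string, bool) {
	if v, ok := lookup("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT"); ok && v != "" {
		return v, true
	}
	if v, ok := lookup("OTEL_EXPORTER_OTLP_ENDPOINT"); ok && v != "" {
		return strings.TrimSuffix(v, "/") + "/v1/metrics", true
	}
	// Neither variable is set; a real exporter would fall back to its default.
	return "", false
}

func main() {
	if url, ok := metricsEndpoint(os.LookupEnv); ok {
		fmt.Println("metrics would be sent to", url)
	} else {
		fmt.Println("no OTLP endpoint env var set")
	}
}
```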

dprotaso avatar Sep 11 '25 16:09 dprotaso

One thing I noticed is that auth for certain backends uses an API key. To configure OTel clients we would use the WithHeaders option to set an Authorization header.

This API key is sensitive, so it should live in a Secret rather than in the config map.

dprotaso avatar Sep 12 '25 13:09 dprotaso

This API key is sensitive, so it should live in a Secret rather than in the config map.

Talking to @Cali0707 - I think a pragmatic approach here is to have metrics be sent to an OTel collector in the cluster. The collector then holds all the configuration that's specific to the desired backend.

One advantage of this is that it allows multiple 'processing' pipelines, so metrics can be sent to one place and then multiplexed to different destinations.

Thus, to prioritize which knobs to support, I think we should focus on what people need to send metrics to a collector. For example, we can default to emitting cumulative metrics and the collector can change the temporality to 'delta'. Thus we don't need a temporality knob. (ref: https://github.com/knative/pkg/issues/3256)
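For anyone unfamiliar with the two temporalities, here is a small worked sketch of the arithmetic a collector-side conversion performs (for instance a processor such as `cumulativetodelta` in the collector contrib distribution). The `toDelta` function is illustrative only and is not the collector's implementation; it assumes a monotonic counter and treats any decrease as a counter reset.

```go
package main

import "fmt"

// toDelta converts successive cumulative readings of a monotonic counter
// into per-interval deltas. The first reading has no predecessor, so n
// readings yield n-1 deltas. A reading lower than its predecessor is
// treated as a counter reset, and the new value is reported as the delta.
func toDelta(cumulative []int64) []int64 {
	if len(cumulative) < 2 {
		return nil
	}
	deltas := make([]int64, 0, len(cumulative)-1)
	for i := 1; i < len(cumulative); i++ {
		prev, cur := cumulative[i-1], cumulative[i]
		if cur < prev {
			// Counter restarted from zero (e.g. the pod was replaced).
			deltas = append(deltas, cur)
			continue
		}
		deltas = append(deltas, cur-prev)
	}
	return deltas
}

func main() {
	// Cumulative request counts as a component would emit them.
	readings := []int64{10, 25, 25, 40, 5, 12}
	fmt.Println(toDelta(readings)) // prints [15 0 15 5 7]
}
```

Because the conversion needs only the previous reading per series, it can live entirely in the collector, which is why the emitting component can stay on cumulative.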

dprotaso avatar Sep 12 '25 16:09 dprotaso

Thus, to prioritize which knobs to support, I think we should focus on what people need to send metrics to a collector.

@dprotaso do we have anywhere we are tracking all the knobs + which ones we want to expose? Maybe in the otel gdoc?

Cali0707 avatar Sep 16 '25 12:09 Cali0707

@Cali0707

do we have anywhere we are tracking all the knobs

No - but there are a ton of knobs. e.g. the OTLP HTTP metrics exporter knobs are here: https://pkg.go.dev/go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp

We could just recommend folks use env vars as I mentioned here https://github.com/knative/pkg/issues/3257#issuecomment-3281896995

But then we need a way to set env vars on components we deploy - eg. Knative Revisions

dprotaso avatar Sep 16 '25 13:09 dprotaso

But then we need a way to set env vars on components we deploy - eg. Knative Revisions

This would specifically be the queue-proxy, right (since they can already set environment variables on the user containers)?

evankanderson avatar Sep 16 '25 13:09 evankanderson

This would specifically be the queue-proxy, right (since they can already set environment variables on the user containers)?

Yup.

I believe eventing has scenarios where some infrastructure is dynamic based on the CRDs applied.

dprotaso avatar Sep 16 '25 13:09 dprotaso

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Dec 16 '25 01:12 github-actions[bot]