Missing metrics or equivalents

Open eloo-abi opened this issue 1 month ago • 8 comments

Since the switch to OTel for metrics, it looks like the metrics and labels were not just renamed. For some of them I can no longer find any equivalent metric at all.

So I would like to ask: is there any documentation on which metrics were removed or renamed?

Or is there a plan to reach parity with the old metrics again?

For example, I am looking for a replacement for activator_request_latencies_bucket. In the past this could be used to measure the speed of the functions as well as to detect cold starts via the percentiles.

At the moment I can only see kn_workqueue_queue_duration_seconds_bucket, but it lacks the revision and configuration information for Serving, so it is mostly not usable.

Thanks

eloo-abi avatar Nov 03 '25 16:11 eloo-abi

Hey - we've adopted otelhttp in the activator, so you should be able to get the latency metrics that library outputs.

Those follow semantic conventions: https://opentelemetry.io/docs/specs/semconv/http/http-metrics/

If you export to Prometheus, that would be e.g. http_server_request_duration_seconds_bucket{job="activator-service"}
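For the percentile use case from the original question, a minimal sketch of how such a query could be assembled is shown here. It only builds a PromQL string around the bucket metric named above; the helper name `latency_quantile_query`, the 5m rate window, and the 0.95 quantile are illustrative assumptions, not something defined by Knative.

```python
def latency_quantile_query(
    quantile: float,
    job: str = "activator-service",  # job label taken from the example above
    window: str = "5m",              # assumed rate window, adjust to your scrape interval
) -> str:
    """Build a PromQL histogram_quantile query over the otelhttp duration buckets."""
    selector = f'http_server_request_duration_seconds_bucket{{job="{job}"}}'
    # Aggregate the per-bucket rates by `le` before taking the quantile.
    return f"histogram_quantile({quantile}, sum by (le) (rate({selector}[{window}])))"


if __name__ == "__main__":
    # Print a p95 query that can be pasted into a dashboard panel.
    print(latency_quantile_query(0.95))
```

Adding further labels to the `sum by (...)` clause would split the percentile per label, provided the exported series actually carry those labels.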

dprotaso avatar Nov 03 '25 22:11 dprotaso

Ah, okay. I have also checked the document here: https://docs.google.com/document/d/1QQ_ubc0RjeZbRHdN4rQR85Z7RZfTSjz4GoKsE0dZ2Z0/edit?pli=1&tab=t.0

But it is very hard to follow and to map against the old metrics (I am migrating a dashboard to the new metrics).

What I have found so far is that related metrics can easily be located by searching for "kn_" labels like this:

kn_configuration_name

Maybe this could be added to the docs so it is easier to understand and migrate.

eloo-abi avatar Nov 04 '25 07:11 eloo-abi

I'll need to update the website. I added attributes for various metrics but have missed some.

dprotaso avatar Nov 04 '25 14:11 dprotaso

Actually, I have it in the top-level section - e.g. see autoscaler:

https://knative.dev/docs/serving/observability/metrics/serving-metrics/#autoscaler

dprotaso avatar Nov 04 '25 14:11 dprotaso

But we're missing it for the otelhttp ones - let me make a docs issue for that

dprotaso avatar Nov 04 '25 14:11 dprotaso

Note - it was brought up to me that the metric names had drifted from the original design document.

I'm going to change the names to match and cherry-pick these back to the 1.19/1.20 releases. See: https://github.com/knative/serving/pull/16290

In the newer 1.19/1.20 cherry-picks we will change:

  • kn.queueproxy.app.duration -> kn.serving.invocation.duration
  • kn.queueproxy.depth -> kn.serving.queue.depth
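For anyone migrating dashboards, the two renames above could be applied to existing query strings with a small helper like the sketch below. It assumes the Prometheus exporter turns the dots in OTel names into underscores and leaves any unit or type suffix untouched; the function names and the `_seconds_bucket` suffix in the usage line are hypothetical, so check the names your exporter actually produces.

```python
# Renames announced in this thread (OTel dotted names, old -> new).
RENAMES = {
    "kn.queueproxy.app.duration": "kn.serving.invocation.duration",
    "kn.queueproxy.depth": "kn.serving.queue.depth",
}


def to_prometheus_prefix(otel_name: str) -> str:
    """Assumption: dots become underscores; suffixes are added separately by the exporter."""
    return otel_name.replace(".", "_")


def migrate_query(query: str) -> str:
    """Replace old metric name prefixes in a dashboard query with the new ones."""
    for old, new in RENAMES.items():
        query = query.replace(to_prometheus_prefix(old), to_prometheus_prefix(new))
    return query


if __name__ == "__main__":
    # Hypothetical dashboard query; the suffix is an assumption.
    print(migrate_query("sum by (le) (rate(kn_queueproxy_app_duration_seconds_bucket[5m]))"))
```

Because only the name prefix is rewritten, whatever suffix the exporter appends is carried over unchanged.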

dprotaso avatar Dec 03 '25 19:12 dprotaso

@eloo-abi see this docs PR https://github.com/knative/docs/pull/6533

Let me know if you need more clarifying information. Otherwise I'll close out this issue when it merges and I cherry-pick it back.

dprotaso avatar Dec 03 '25 23:12 dprotaso

@dprotaso yes, your changes look good. The naming is also more understandable.

thanks

eloo-abi avatar Dec 04 '25 08:12 eloo-abi