Missing metrics or equivalents
Since the switch to OTel for metrics, it looks like the metrics and labels were not only renamed. For some of them I can no longer find any equivalent metric.
So I would like to ask: is there any documentation on which metrics were removed or renamed?
Or is there a plan to reach parity with the old metrics again?
E.g. I am looking for a replacement for activator_request_latencies_bucket.
In the past this could be used to measure the speed of the functions and to detect cold starts via the percentiles.
At the moment I can only see kn_workqueue_queue_duration_seconds_bucket, but it lacks the revision and configuration information for a service.
So it is mostly not usable.
Thanks
Hey - we've adopted otelhttp in the activator, so you should be able to get the latency metrics that library outputs.
Those follow semantic conventions: https://opentelemetry.io/docs/specs/semconv/http/http-metrics/
If exporting to Prometheus, that would be e.g. http_server_request_duration_seconds_bucket{job="activator-service"}
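For the percentile use case, PromQL's histogram_quantile() can be applied to that bucket series. As a rough illustration of what that estimate does, here is a minimal Python sketch that interpolates a quantile from cumulative bucket counts. The function name, the bucket bounds, and the counts are made up for the example and are not taken from the activator.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets.

    `buckets` is a list of (upper_bound_seconds, cumulative_count) pairs,
    the same shape as the `le` buckets of a Prometheus histogram.
    """
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            # The +Inf bucket has no width, so fall back to the last finite bound.
            if math.isinf(bound) or count == prev_count:
                return prev_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical sample: 100 requests, most of them fast, a few slow (e.g. cold starts).
sample = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p50 = estimate_quantile(0.50, sample)
p95 = estimate_quantile(0.95, sample)
```

A large gap between p50 and p95 or p99 is the kind of signal that was previously used to spot cold starts.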
Ah okay. I have also checked the document here: https://docs.google.com/document/d/1QQ_ubc0RjeZbRHdN4rQR85Z7RZfTSjz4GoKsE0dZ2Z0/edit?pli=1&tab=t.0
but it is super hard to follow and map against the old metrics (I am migrating a dashboard to the new metrics).
What I have found so far is that an easy way to find related metrics is to search for "kn_" labels like this:
kn_configuration_name
Maybe this could be added to the docs so it's easier to understand and migrate.
I'll need to update the website. I added attributes for various metrics but have missed some.
Actually I have it in the top-level section, e.g. see autoscaler:
https://knative.dev/docs/serving/observability/metrics/serving-metrics/#autoscaler
But we're missing it for the otelhttp ones - let me make a docs issue for that
Note - it was brought up to me that there was a drift in the metric names from the original design document.
I'm going to change the names to match and cherry-pick these back to the 1.19/1.20 releases. See: https://github.com/knative/serving/pull/16290
In the newer 1.19/1.20 cherry-picks we will change:
- kn.queueproxy.app.duration -> kn.serving.invocation.duration
- kn.queueproxy.depth -> kn.serving.queue.depth
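If it helps with migrating dashboards, the two renames above can be applied mechanically. This is only a sketch: the helper name is hypothetical, and it works on the OTel metric names as written here. Names exported to Prometheus are usually transformed (dots to underscores, unit suffixes), so the mapping would need adjusting to whatever your backend actually shows.

```python
# Old OTel metric name -> new OTel metric name, as listed above.
RENAMES = {
    "kn.queueproxy.app.duration": "kn.serving.invocation.duration",
    "kn.queueproxy.depth": "kn.serving.queue.depth",
}


def migrate_metric_names(text):
    """Replace every old metric name in `text` with its new name."""
    for old, new in RENAMES.items():
        text = text.replace(old, new)
    return text


# Hypothetical dashboard snippet that mentions both old names.
before = "panel A: kn.queueproxy.app.duration, panel B: kn.queueproxy.depth"
after = migrate_metric_names(before)
```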
@eloo-abi see this docs PR https://github.com/knative/docs/pull/6533
Let me know if you need more clarifying information. Otherwise I'll close out this issue when it merges and I cherry-pick it back.
@dprotaso yes, your changes look good. Also, the naming is more understandable.
thanks