Missing latency by HTTP status code for Knative Serving with Kourier
I'm trying to monitor request latency for my Knative services broken down by HTTP response code (or response code class like 2xx, 4xx, 5xx), similar to how I can monitor RPS.
The Envoy metrics envoy_cluster_external_upstream_rq_time_sum and envoy_cluster_external_upstream_rq_time_count don't include the envoy_response_code or envoy_response_code_class labels, so I cannot break down latency by response status.
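For comparison, a per-status-class RPS breakdown already works from Envoy's request counters. A minimal sketch of the kind of query I mean (label names assume Envoy's default Prometheus tag extraction on the Kourier gateway pods, where the status class ends up in the envoy_response_code_class label):

```promql
# Requests per second by response code class (2xx/4xx/5xx) and Envoy cluster.
# Assumes Envoy's default tag extraction on the Kourier gateway pods.
sum by (envoy_response_code_class, envoy_cluster_name) (
  rate(envoy_cluster_upstream_rq_xx[5m])
)
```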
I've checked:
- envoy_cluster_external_upstream_rq_time_* - no response code labels
- envoy_cluster_upstream_rq_time_* - no response code labels
- envoy_http_downstream_rq_time_* - no cluster-specific or response code labels
- kn_revision_* metrics - these track autoscaler metrics but not request latency
What is the recommended way to monitor request latency per response code (or response code class) for Knative services?
Environment
- Knative Serving version: [your version]
- Ingress: Kourier (3scale-kourier-gateway)
- Monitoring: Prometheus + Grafana
cc @dsimansk do you know what metrics kourier emits?
Hi,
I think this is rather an Envoy issue, as Envoy sits between the client and the server. My understanding is that the net-kourier-controller is unaware of the details of the requests and can't know anything about them.
I also just checked my Envoy metrics (using 1.36) as well as the metrics of the net-kourier-controller, and couldn't find anything that would help, same as you describe in the issue description.
Not sure if there is anything we can do here.
So how do people usually monitor their Knative request latencies? What I ended up doing is putting Nginx in front of Kourier, but this adds an extra hop that I'd like to avoid.
Envoy (as part of Kourier) exposes its default metrics; latency is provided per service as part of the envoy_cluster_upstream_rq_time_bucket histogram.
You can use that to measure latency, but as Envoy doesn't put status code labels on that histogram, you can't break it down per response code.
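For the per-service latency (without the status code split), a minimal PromQL sketch, assuming the Kourier gateway pods are scraped by Prometheus and that envoy_cluster_name identifies the revision's backing cluster:

```promql
# p95 upstream request latency (ms) per Envoy cluster over the last 5 minutes.
histogram_quantile(
  0.95,
  sum by (le, envoy_cluster_name) (
    rate(envoy_cluster_upstream_rq_time_bucket[5m])
  )
)
```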
There is a similar issue in the Envoy repo, but it was closed 5 years ago; you could give that another bump if it's important to you:
Besides that, Istio's sidecar pods do have a metric, istio_request_duration_milliseconds, that also contains the status code:
istio_request_duration_milliseconds_bucket{reporter="destination",source_workload="unknown",source_canonical_service="unknown",source_canonical_revision="latest",source_workload_namespace="unknown",source_principal="unknown",source_app="unknown",source_version="unknown",source_cluster="unknown",destination_workload="httpbin",destination_workload_namespace="default",destination_principal="unknown",destination_app="httpbin",destination_version="",destination_service="httpbin.default.svc.cluster.local",destination_canonical_service="httpbin",destination_canonical_revision="latest",destination_service_name="httpbin",destination_service_namespace="default",destination_cluster="Kubernetes",request_protocol="http",response_code="404",grpc_response_status="",response_flags="-",connection_security_policy="none",le="0.5"} 0
You could try using Istio with Knative; it's also supported.
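If you go that route, a sketch of a latency-per-response-code query on top of Istio's standard metrics (label names as in the sample above):

```promql
# p95 request latency (ms) per destination service and response code,
# as reported by the destination sidecar.
histogram_quantile(
  0.95,
  sum by (le, destination_service_name, response_code) (
    rate(istio_request_duration_milliseconds_bucket{reporter="destination"}[5m])
  )
)
```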
You could also give distributed tracing a look; IIRC you can filter your requests there by response code as well.