observability
Internal RPC latency differences
internal_rpc latency is reported differently on the Default and Ops dashboards, making it difficult to tell what the true p95 and p99 values are. For example:
PromQL query for internal_rpc latency p95 on Ops Dashboard:
histogram_quantile(0.95, sum(rate(redpanda_rpc_request_latency_seconds_bucket{instance=~"[[node]]",exported_instance=~"[[exported_node]]",shard=~"[[node_shard]]",redpanda_cloud_data_cluster_name=~"[[data_cluster]]"}[5m])) by (le, [[aggr_criteria]]))
PromQL query for internal_rpc latency p95 on Default Dashboard:
histogram_quantile(0.95, sum(rate(redpanda_rpc_request_latency_seconds_bucket{instance=~"$node",redpanda_server="internal"}[$__rate_interval])) by (le, $aggr_criteria))
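Comparing the two queries, the Ops query has no redpanda_server="internal" matcher, so it appears to aggregate buckets across every value of that label, and it uses a fixed 5m rate window where the Default query uses $__rate_interval. Either difference could account for the mismatch. Below is a rough sketch of an Ops query aligned with the Default one. It assumes the Ops panel is meant to cover internal RPC only and that the redpanda_server label is present on the series behind the Ops dashboard; neither has been confirmed here.

```promql
# Sketch only: the Ops dashboard query with two changes.
# 1. redpanda_server="internal" added (assumed to exist on these series).
# 2. The fixed [5m] window replaced with $__rate_interval to match the Default dashboard.
histogram_quantile(
  0.95,
  sum(
    rate(
      redpanda_rpc_request_latency_seconds_bucket{
        instance=~"[[node]]",
        exported_instance=~"[[exported_node]]",
        shard=~"[[node_shard]]",
        redpanda_cloud_data_cluster_name=~"[[data_cluster]]",
        redpanda_server="internal"
      }[$__rate_interval]
    )
  ) by (le, [[aggr_criteria]])
)
```

Replacing 0.95 with 0.99 gives the corresponding p99 query. The values selected for aggr_criteria on each dashboard also need to match for the panels to be comparable, since both queries group by that variable.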