
feat: Optional include label value cardinality in `/LabelNames`

Open bryanhuhta opened this issue 5 months ago • 2 comments

closes https://github.com/grafana/pyroscope/issues/3226

Overview

This PR is a proof-of-concept (POC) change that lets callers request estimated label value cardinality alongside label names. This allows the UI to make substantially fewer requests to populate the label view (see below).

Screenshot 2024-09-17 at 5 25 26 PM

In order for the UI to render the labels bar, it needs to:

  • Make a /LabelNames request
  • For every name, make a /LabelValues request
  • Sort by cardinality (higher goes first)
  • Render the bar

Selecting the top N labels by cardinality this way is expensive in network terms: it takes one /LabelNames request plus one /LabelValues request for every label name.
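The flow above can be sketched in Go as follows. This is only an illustration: `LabelClient`, `fakeClient`, and `topLabels` are hypothetical names invented here, not the real Pyroscope client or UI code, and the in-memory client exists only to count requests.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelClient is a hypothetical stand-in for the querier API; the real
// Pyroscope client has different types and signatures.
type LabelClient interface {
	LabelNames() ([]string, error)
	LabelValues(name string) ([]string, error)
}

// fakeClient serves labels from memory and counts the requests it receives.
type fakeClient struct {
	labels   map[string][]string
	requests int
}

func (c *fakeClient) LabelNames() ([]string, error) {
	c.requests++
	names := make([]string, 0, len(c.labels))
	for name := range c.labels {
		names = append(names, name)
	}
	sort.Strings(names)
	return names, nil
}

func (c *fakeClient) LabelValues(name string) ([]string, error) {
	c.requests++
	return c.labels[name], nil
}

// topLabels mirrors the steps above: one /LabelNames call, one /LabelValues
// call per name, then a sort by value count (higher goes first).
func topLabels(c LabelClient, n int) ([]string, error) {
	names, err := c.LabelNames()
	if err != nil {
		return nil, err
	}
	counts := make(map[string]int, len(names))
	for _, name := range names {
		values, err := c.LabelValues(name)
		if err != nil {
			return nil, err
		}
		counts[name] = len(values)
	}
	sort.SliceStable(names, func(i, j int) bool {
		return counts[names[i]] > counts[names[j]]
	})
	if n >= 0 && n < len(names) {
		names = names[:n]
	}
	return names, nil
}

func main() {
	c := &fakeClient{labels: map[string][]string{
		"pod":       {"a", "b", "c"},
		"namespace": {"dev"},
		"region":    {"us", "eu"},
	}}
	top, err := topLabels(c, 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(top, c.requests) // [pod region] 4
}
```

With three label names the sketch issues four requests (1 + N), which is the cost this PR aims to reduce.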

Approach

This PR modifies the /LabelNames API to allow a caller to optionally request estimated cardinality:

message LabelNamesRequest {
  repeated string matchers = 1;
  int64 start = 2;
  int64 end = 3;
  optional bool include_cardinality = 4;
}

This would provide the following response:

message LabelNamesResponse {
  repeated string names = 1;
  repeated int64 estimated_cardinality = 2;
}

where names and estimated_cardinality correspond 1:1 by index (if cardinality is requested). If cardinality is not requested, estimated_cardinality will be empty.
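As a rough sketch of how a caller might consume this response, the Go snippet below pairs each name with its estimate by index and keeps the top N. The `LabelNamesResponse` struct here is a hand-written mirror of the proto message, not the generated type, and `labelEstimate` and `topByCardinality` are hypothetical helpers introduced for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// LabelNamesResponse is a hand-written mirror of the proto message above,
// not the generated Go type.
type LabelNamesResponse struct {
	Names                []string
	EstimatedCardinality []int64
}

// labelEstimate pairs a label name with its estimated cardinality.
type labelEstimate struct {
	Name        string
	Cardinality int64
}

// topByCardinality pairs Names[i] with EstimatedCardinality[i] and returns the
// n labels with the highest estimates. It returns an error when the two
// slices differ in length, e.g. when cardinality was not requested.
func topByCardinality(resp LabelNamesResponse, n int) ([]labelEstimate, error) {
	if len(resp.Names) != len(resp.EstimatedCardinality) {
		return nil, fmt.Errorf("got %d names but %d estimates",
			len(resp.Names), len(resp.EstimatedCardinality))
	}
	labels := make([]labelEstimate, len(resp.Names))
	for i, name := range resp.Names {
		labels[i] = labelEstimate{Name: name, Cardinality: resp.EstimatedCardinality[i]}
	}
	sort.SliceStable(labels, func(i, j int) bool {
		return labels[i].Cardinality > labels[j].Cardinality
	})
	if n >= 0 && n < len(labels) {
		labels = labels[:n]
	}
	return labels, nil
}

func main() {
	// Made-up values for illustration only.
	resp := LabelNamesResponse{
		Names:                []string{"namespace", "pod", "region"},
		EstimatedCardinality: []int64{3, 120, 8},
	}
	top, err := topByCardinality(resp, 2)
	if err != nil {
		panic(err)
	}
	for _, l := range top {
		fmt.Printf("%s ~%d\n", l.Name, l.Cardinality)
	}
}
```

Checking the lengths up front guards against a response where cardinality was not requested and estimated_cardinality is therefore empty.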

Note that the cardinality returned is an estimate. This PR does not deduplicate merged responses, so a value reported by more than one source may be counted more than once. This is fine, as the UI only needs a way to select the top N labels by cardinality to render the label bar. It can make subsequent requests to /LabelValues to find exact counts. This should reduce network calls from 20 or more to ~7.
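To illustrate why a merge without deduplication only yields an estimate, here is a minimal Go sketch. It assumes partial responses are combined by summing per-label counts, which is an assumption made for this example rather than something the PR spells out; `mergeEstimates` is a hypothetical helper.

```go
package main

import "fmt"

// mergeEstimates sums per-label counts from several partial responses.
// ASSUMPTION: summing is used here only to illustrate a merge without
// deduplication. A value present in more than one response is counted once
// per response, so the merged count is an upper bound on the true cardinality.
func mergeEstimates(parts ...map[string]int64) map[string]int64 {
	merged := make(map[string]int64)
	for _, part := range parts {
		for name, count := range part {
			merged[name] += count
		}
	}
	return merged
}

func main() {
	// Hypothetical data: both sources saw pod="b".
	a := map[string]int64{"pod": 2} // pod in {a, b}
	b := map[string]int64{"pod": 2} // pod in {b, c}
	fmt.Println(mergeEstimates(a, b)["pod"]) // 4, although only 3 distinct values exist
}
```

An overestimate like this can still rank labels well enough to pick the top N, after which /LabelValues can supply exact counts.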

Performance

I ran three 10-minute load tests, each requesting 24 hours of data for service_name="fire-dev-001/querier".

  1. /LabelNames with no cardinality
  2. /LabelNames with cardinality
  3. /LabelNames with no cardinality followed by N /LabelValues

TODO(bryan): Attach benchmark results

| Test | # Reqs | p99 latency |
| --- | --- | --- |
| /LabelNames with no cardinality | ? | ? |
| /LabelNames with cardinality | ? | ? |
| /LabelNames / /LabelValues | ? | ? |

Explore Profiles view

Results

As expected, this PR puts more pressure on store gateways fulfilling a /LabelNames request with cardinality, as they also need to look up the label values when scanning postings. However, I think this tradeoff is acceptable as it reduces the number of subsequent network calls the store gateways need to handle. So a /LabelNames request that is 2x more expensive removes the need for about 15 cheaper requests.

Alternatives

This PR provides one possible approach, which is to reuse the /LabelNames endpoint. An alternative is to create a brand new endpoint, /LabelCardinality. This has the advantage of making performance characteristics much more predictable: a /LabelNames call would not suddenly become much more expensive just because cardinality was requested. We could also tune the new endpoint to do the "top N" calculation server-side and provide the exact cardinality instead of an estimate.

A drawback, of course, is yet another API endpoint specifically to power a single component in the UI.
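For comparison, a server-side "top N" with exact cardinality could look roughly like the Go sketch below. /LabelCardinality is only proposed in this description, so everything here (`labelCount`, `exactTopN`, the input shape) is hypothetical; the point is that exact counts require a union of value sets rather than a sum of counts.

```go
package main

import (
	"fmt"
	"sort"
)

// labelCount holds a label name and its exact number of distinct values.
type labelCount struct {
	Name  string
	Count int
}

// exactTopN unions the values each source saw per label name, so a value
// reported by several sources is counted once, then keeps the n largest.
func exactTopN(n int, parts ...map[string][]string) []labelCount {
	seen := make(map[string]map[string]struct{})
	for _, part := range parts {
		for name, values := range part {
			if seen[name] == nil {
				seen[name] = make(map[string]struct{})
			}
			for _, v := range values {
				seen[name][v] = struct{}{}
			}
		}
	}
	counts := make([]labelCount, 0, len(seen))
	for name, values := range seen {
		counts = append(counts, labelCount{Name: name, Count: len(values)})
	}
	// Sort by count descending, then by name so ties are deterministic.
	sort.Slice(counts, func(i, j int) bool {
		if counts[i].Count != counts[j].Count {
			return counts[i].Count > counts[j].Count
		}
		return counts[i].Name < counts[j].Name
	})
	if n >= 0 && n < len(counts) {
		counts = counts[:n]
	}
	return counts
}

func main() {
	// Hypothetical data: both sources saw pod="b" and region="us".
	a := map[string][]string{"pod": {"a", "b"}, "region": {"us"}}
	b := map[string][]string{"pod": {"b", "c"}, "region": {"us"}}
	fmt.Println(exactTopN(1, a, b)) // [{pod 3}]
}
```

In this sketch, exactness comes from merging the values themselves rather than single counts, which is one reason a dedicated endpoint would have different performance characteristics.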

bryanhuhta, Sep 18 '24 00:09