Increase metrics bucketing efficiency

Open jjbayer opened this issue 2 years ago • 3 comments

Since https://github.com/getsentry/relay/issues/2083, we skip metrics extraction for indexed transactions in PoPs, and extract those metrics in processing relays instead.

This deteriorated our bucketing efficiency, because metrics extracted by PoPs are routed to processing relays by bucket key, but the transaction payloads themselves are routed round-robin.
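A rough sketch of why round-robin routing hurts here, assuming a single flush window and a simplified bucket key (a plain tuple). The function names and the CRC-based hash are illustrative assumptions, not Relay's actual implementation:

```python
import zlib
from itertools import cycle


def relay_for_key(bucket_key: tuple, num_relays: int) -> int:
    """Pick a relay from a stable hash of the (simplified) bucket key."""
    return zlib.crc32(repr(bucket_key).encode()) % num_relays


def emitted_buckets(assignments) -> int:
    """Each relay aggregates locally, so it emits one bucket per distinct
    key it received. Count distinct (relay, key) pairs across the pool."""
    return len(set(assignments))


def compare_routing(bucket_keys, num_relays: int):
    """Return (buckets with key-based routing, buckets with round-robin)."""
    keyed = [(relay_for_key(k, num_relays), k) for k in bucket_keys]
    rr = cycle(range(num_relays))  # round-robin ignores the key entirely
    round_robin = [(next(rr), k) for k in bucket_keys]
    return emitted_buckets(keyed), emitted_buckets(round_robin)
```

With four values each for two keys and a pool of four relays, `compare_routing` returns `(2, 8)`: key-based routing lets each key collapse into one bucket, while round-robin spreads every key over all four relays.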

Possible solutions to improve bucketing efficiency again (non-exhaustive):

  1. Route transactions by a subset of their corresponding metrics' bucket key.
  2. Make sure that metrics extracted by processing relays pass through the same routing as buckets emitted by PoPs, by sending them over the network instead of inserting them into the local aggregator directly.
     2.a. This could be accomplished by separating the concerns of processing and metrics aggregation into two different relay pools (as we did for project configs). See https://github.com/getsentry/team-ingest/issues/139.
  3. ...
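Option 1 could look roughly like the sketch below. The routing fields (`project_id`, `transaction`), the metric name `"duration"`, and the hash are hypothetical choices for illustration; the issue does not specify which subset of the bucket key would be used:

```python
import zlib


def relay_for(routing_key: tuple, num_relays: int) -> int:
    """Stable hash of the routing key, reduced to a relay index."""
    return zlib.crc32(repr(routing_key).encode()) % num_relays


def transaction_routing_key(tx: dict) -> tuple:
    # Hypothetical subset of the bucket key: fields that are already
    # known on the transaction payload before metrics are extracted.
    return (tx["project_id"], tx["transaction"])


def relays_per_bucket(transactions, num_relays: int) -> dict:
    """Map each extracted bucket key to the set of relays that would emit it."""
    seen = {}
    for tx in transactions:
        relay = relay_for(transaction_routing_key(tx), num_relays)
        # Simplified bucket key of a metric extracted from this transaction.
        bucket_key = (tx["project_id"], "duration", tx["transaction"], tx["status"])
        seen.setdefault(bucket_key, set()).add(relay)
    return seen
```

Because every bucket key in this sketch contains the routing fields, all transactions contributing to one bucket land on the same relay, so each bucket is emitted once per flush rather than once per relay.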

jjbayer • Oct 30 '23 13:10

The number of buckets on the transaction metrics topic increased by 70%.

jjbayer • Nov 09 '23 14:11

  • Additional routing + aggregation layer brings latency -> don't want that
  • Routing transactions might help, but metrics extracted in PoPs would still go to a different processing relay, and metrics with fewer tags than the common set (e.g. usage) would not benefit from this.
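The second caveat can be illustrated with a small sketch. It assumes transactions are routed by a hypothetical `(project_id, transaction name)` key; the metric names `"duration"` and `"usage"` and the hash are placeholders, not Relay's real identifiers:

```python
import zlib
from collections import defaultdict


def relay_for(routing_key: tuple, num_relays: int) -> int:
    """Stable hash of the routing key, reduced to a relay index."""
    return zlib.crc32(repr(routing_key).encode()) % num_relays


def emitting_relays(transactions, num_relays: int) -> dict:
    """Return {bucket_key: set of relays that would emit a bucket for it}.

    `transactions` is an iterable of (project_id, transaction_name) pairs.
    """
    emitters = defaultdict(set)
    for project_id, name in transactions:
        relay = relay_for((project_id, name), num_relays)
        # Tagged metric: its bucket key contains the routing fields.
        emitters[(project_id, "duration", name)].add(relay)
        # Usage-style metric: no transaction tag, so the routing key
        # is not a subset of its bucket key.
        emitters[(project_id, "usage")].add(relay)
    return dict(emitters)
```

Each `"duration"` bucket is emitted by exactly one relay, whereas the `"usage"` bucket is emitted by every relay that received any of the project's transactions.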

jjbayer • Jan 24 '24 12:01

> Additional routing + aggregation layer brings latency -> don't want that

We thought this through: it would not actually add latency compared to metrics that are extracted in PoPs. In both cases the time frame is roughly the same (PoP Aggregator -> Processing Aggregator vs. Processing Aggregator -> Processing Aggregator).

jjbayer • Jan 29 '24 11:01

We should re-visit this if it becomes a problem for either Kafka (+ consumers) or Relay itself.

Dav1dde • Dec 11 '24 13:12