
Separate expire_metrics_secs timer per metric set

Open johnhtodd opened this issue 6 months ago • 0 comments

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

I have built several aggregations. I flush their metrics into logs every N seconds, but N varies considerably depending on the kind of aggregation. The aggregations have high cardinality, and I want to transmit only meaningful metrics: those that have had activity within N seconds and continue to increment their counters. Metrics that see no further activity in one of these aggregation sets (that is, singletons) are not particularly important. Because each set is flushed to logs on a different timer, I have to set the global "expire_metrics_secs" value higher than the longest flush interval across my sets. As a result, in the more frequently flushed sets, where N is lower, idle metrics linger and produce a lot of chaff events that I would prefer to have deleted.

A solution would be to create expiration timers that apply to each metric set, so that metrics with no activity are deleted from each set independently of the timers on other metric sets.
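The behaviour requested above can be sketched in a few lines of Python. This is only an illustration of the semantics, not Vector code: `MetricSet`, `GLOBAL_EXPIRE_SECS`, and the `expire_secs` override are hypothetical names, and the timestamps are passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical stand-in for the global expire_metrics_secs default.
GLOBAL_EXPIRE_SECS = 300.0


@dataclass
class MetricSet:
    """One aggregation set with an optional expiration timer of its own."""
    name: str
    expire_secs: Optional[float] = None  # None means "use the global default"
    counters: Dict[str, int] = field(default_factory=dict)
    last_seen: Dict[str, float] = field(default_factory=dict)

    def ttl(self) -> float:
        # Per-set value wins; otherwise fall back to the global default.
        return GLOBAL_EXPIRE_SECS if self.expire_secs is None else self.expire_secs

    def increment(self, key: str, now: float) -> None:
        # Record activity and refresh the metric's last-seen time.
        self.counters[key] = self.counters.get(key, 0) + 1
        self.last_seen[key] = now

    def expire(self, now: float) -> List[str]:
        # Delete metrics idle for longer than this set's own timer.
        stale = [k for k, t in self.last_seen.items() if now - t > self.ttl()]
        for k in stale:
            del self.counters[k]
            del self.last_seen[k]
        return stale


fast = MetricSet("fast", expire_secs=10.0)  # flushed often, short timer
slow = MetricSet("slow")                    # inherits the global default
fast.increment("client_a", now=0.0)
slow.increment("client_a", now=0.0)
```

With these values, calling `expire(now=11.0)` removes `client_a` from `fast` while `slow` keeps it until the global default has elapsed, which is the independence between sets that this request describes.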

This is not a critical feature, but it seems to be a missing concept for use cases that aggregate along different dimensions.

Attempted Solutions

I have not attempted any solution for this other than setting expire_metrics_secs to the highest of all my aggregation flush intervals. I could also put a filter in my metrics_to_logs path that drops any event with a counter of 1 so that it is not transmitted onwards, but that isn't precisely what I want, since it takes only quantity into account and not time.
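To make the difference between the two criteria concrete, here is a small Python sketch. The event fields `count` and `last_seen`, and both function names, are hypothetical and chosen only for illustration; they are not taken from Vector's event schema.

```python
from typing import Dict, List

Event = Dict[str, float]


def drop_singletons(events: List[Event]) -> List[Event]:
    # Quantity-based workaround: keep only events whose counter exceeds 1.
    return [e for e in events if e["count"] > 1]


def drop_idle(events: List[Event], now: float, max_idle_secs: float) -> List[Event]:
    # Time-based behaviour: keep only events seen within max_idle_secs.
    return [e for e in events if now - e["last_seen"] <= max_idle_secs]


events = [
    {"id": 1, "count": 50, "last_seen": 0.0},   # old burst, idle since
    {"id": 2, "count": 1, "last_seen": 99.0},   # new metric, just appeared
]

by_count = drop_singletons(events)
by_time = drop_idle(events, now=100.0, max_idle_secs=30.0)
```

In this example the count filter keeps the long-idle burst and drops the newly active metric, while the time filter does the opposite, so the filter is not a substitute for a per-set expiration timer.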

Proposal

I propose that expire_metrics_secs remain definable globally as a default, and that each metric set be able to specify its own timer when the set is declared.

References

No response

Version

No response

johnhtodd • Jan 31 '24 08:01