
Expose additional metrics in prometheus

Open grzesuav opened this issue 4 years ago • 1 comments

For example:

  • number of resync failures

grzesuav avatar Mar 19 '20 21:03 grzesuav

:wave: The number of resync failures is a very specific yet useful metric: one could alert on it whenever resync errors occur.

A more generic approach would be something like the RED method for observability (Rate, Errors, Duration):

  • Rate: requests per second
  • Errors: failed requests per second
  • Duration: how long each request takes
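
As a rough illustration of the three RED signals listed above, here is a minimal in-memory sketch in Python. `RedRecorder`, `observe`, and the controller name `my-controller` are hypothetical names made up for this example; they are not part of metacontroller or of any Prometheus client library. A real implementation would more likely export counters and a histogram through a Prometheus client.

```python
import time
from collections import defaultdict


class RedRecorder:
    """Hypothetical in-memory recorder; not a metacontroller or Prometheus API."""

    def __init__(self):
        self.requests = defaultdict(int)    # controller name -> sync attempts
        self.errors = defaultdict(int)      # controller name -> failed syncs
        self.durations = defaultdict(list)  # controller name -> seconds per sync

    def observe(self, controller, sync_fn):
        """Run sync_fn once, recording the attempt, its duration, and any failure."""
        start = time.perf_counter()
        self.requests[controller] += 1
        try:
            return sync_fn()
        except Exception:
            self.errors[controller] += 1
            raise  # the caller still sees the original error
        finally:
            self.durations[controller].append(time.perf_counter() - start)


def failing_sync():
    # Stand-in for a sync that fails; purely illustrative.
    raise RuntimeError("resync failed")


recorder = RedRecorder()
recorder.observe("my-controller", lambda: "ok")
try:
    recorder.observe("my-controller", failing_sync)
except RuntimeError:
    pass
```

Dividing the request and error counts by the length of the observation window gives the per-second rates, while the duration list is what a percentile would be computed from.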

I'm still getting started with metacontroller, but since you can run many different controllers, it would be great to have such generic metrics for general observability and troubleshooting.

I could create an alert on errors per second or on the 95th percentile of durations, so that I can take a look whenever a controller is failing or taking too long for whatever reason. That would be a starting point.
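
The alert condition described above could look roughly like the following Python sketch. The function names and the threshold parameters (`max_error_rate`, `max_p95_seconds`) are assumptions chosen for illustration, and the percentile uses the simple nearest-rank definition rather than the bucket interpolation a Prometheus histogram would apply.

```python
def error_rate(error_count, window_seconds):
    """Errors per second over an observation window."""
    return error_count / window_seconds


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not values:
        raise ValueError("no observations")
    ordered = sorted(values)
    # Integer ceiling of pct/100 * n, avoiding floating-point rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def should_alert(error_count, durations, window_seconds,
                 max_error_rate, max_p95_seconds):
    """True if either the error rate or the p95 duration exceeds its threshold."""
    too_many_errors = error_rate(error_count, window_seconds) > max_error_rate
    too_slow = percentile(durations, 95) > max_p95_seconds
    return too_many_errors or too_slow
```

For example, with hypothetical thresholds of 0.1 errors per second and a 1-second p95, twelve errors in a 60-second window would trip the alert even if every sync were fast.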

jesusvazquez avatar Mar 21 '20 18:03 jesusvazquez