
✨ 🔧 [Telemetry] Add metrics to measure health, latency, request rate to model providers

Open • roma-glushko opened this issue 1 year ago • 1 comment

Measure these metrics for all model providers configured:

  • the number of successful requests (counter)
  • the number of failed requests (counter)
  • the response latency (non-streaming lang chat requests)
  • request rate to each provider
  • the first chunk latency (streaming lang chat requests)

roma-glushko • Apr 27 '24 22:04

> request rate to each provider

Rate is a derived metric; it is computed by the time series store. Having a total_requests counter is enough: you can compute the request rate from it.

gernest • May 10 '24 14:05