Add standards-compatible instrumentation to the interceptor
The interceptor already has several debugging/observation endpoints that expose useful information, primarily for debugging. For example, you can fetch a point-in-time copy of the pending queue or the routing table through a simple REST API that you can curl. These endpoints are served by an HTTP server that runs on a dedicated port, separate from the main interceptor (proxy) server. The target client for these endpoints is most likely a human.
Since the interceptor is in the critical path of end-user HTTP requests, there are many more metrics that it could capture, and those would most likely be consumed by a machine.
Use-Case
I'd like the interceptor to run an endpoint from which folks can fetch Prometheus metrics that describe the current state of the system. For example, it could serve histograms for response codes, counters for the total number of requests or timeouts, gauges for in-flight requests, and so forth.
Specification
- The interceptor is instrumented with Prometheus-compatible metrics
- The interceptor runs a Prometheus endpoint
- Documentation is added (probably to development.md or a new document) regarding these metrics
Resources for Developers
- Prometheus getting started guide for Go: https://prometheus.io/docs/guides/go-application/
- GitHub repo: https://github.com/prometheus/client_golang
- Relatively simple setup guide: https://gabrieltanner.org/blog/collecting-prometheus-metrics-in-golang
Another reasonable approach would be through OpenTelemetry but let's keep that out of scope for now.
@tomkerkhove I've barely started this work, so now would be the time to choose between Prometheus and OpenTelemetry (OT). I feel like it would be wise to have KEDA and this project both do the same thing, so shall I follow kedacore/keda#2291 and do OpenTelemetry?
The proposal has only just been opened and not agreed upon yet, so it's up to you.
I think the current landscape is still using a lot of Prometheus so the investment would definitely not be lost and align with KEDA Core, but if you prefer to wait then that's also fine for me.
@tomkerkhove ok. I'm planning to continue with #322 at some point, but a bunch of things have come up before it, so when I get back to it I'll check kedacore/keda#2291 to see if the OT decision has made any progress. Otherwise I'll just finish the Prometheus work!
I think OpenTelemetry should maybe be the new default. Thoughts?