Add standards-compatible instrumentation to the interceptor

Open arschles opened this issue 4 years ago • 6 comments

The interceptor already exposes several debugging/observation endpoints that provide useful information, primarily for debugging. For example, you can fetch a point-in-time copy of the pending queue or the routing table through a simple REST API that you can curl. These endpoints are served by an HTTP server that runs on a dedicated port, separate from the main interceptor (proxy) server. Their target client is most likely a human.

Since the interceptor is in the critical path of end-user HTTP requests, there are many more metrics that could be captured, and these would most likely be consumed by a machine.

Use-Case

I'd like the interceptor to run an endpoint from which folks can fetch Prometheus metrics that describe the current state of the system. For example, it could serve histograms for response codes, counters for the total number of requests or timeouts, gauges for in-flight requests, and so forth.

Specification

  • The interceptor is instrumented with Prometheus-compatible metrics
  • The interceptor runs a Prometheus endpoint
  • Documentation is added (probably to development.md or a new document) regarding these metrics

Resources for Developers

  • Prometheus getting started guide for Go: https://prometheus.io/docs/guides/go-application/
  • GitHub repo: https://github.com/prometheus/client_golang
  • Relatively simple setup guide: https://gabrieltanner.org/blog/collecting-prometheus-metrics-in-golang

arschles avatar Nov 12 '21 17:11 arschles

Another reasonable approach would be through OpenTelemetry but let's keep that out of scope for now.

tomkerkhove avatar Nov 16 '21 08:11 tomkerkhove

@tomkerkhove I've barely started this work, so now would be the time to choose between Prometheus and OpenTelemetry (OT). I feel like it would be wise to have KEDA and this project both do the same thing, so shall I follow kedacore/keda#2291 and do OpenTelemetry?

arschles avatar Nov 17 '21 17:11 arschles

The proposal has only just been opened and not agreed upon yet, so it's up to you.

I think the current landscape still uses a lot of Prometheus, so the investment would definitely not be lost and would align with KEDA Core, but if you prefer to wait then that's also fine with me.

tomkerkhove avatar Nov 18 '21 07:11 tomkerkhove

@tomkerkhove OK. I'm planning to continue with #322 at some point, but a bunch of things have come up ahead of it, so when I get back to it I'll check kedacore/keda#2291 to see if the OT decision has made any progress. Otherwise I'll just finish the Prometheus work!

arschles avatar Dec 03 '21 18:12 arschles

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 22 '22 23:03 stale[bot]

I think OpenTelemetry should maybe be the new default. Thoughts?

tomkerkhove avatar May 11 '22 12:05 tomkerkhove