/healthz endpoint should record statistics

Open davecheney opened this issue 5 years ago • 5 comments

Our /healthz endpoint returns an error to the client if the k8s ping fails, but Contour does not record that failure anywhere.

I don't want to log this at the contour level as it could cause log spam if the health check is flaky.

We should add metrics around this.

  • [ ] /healthz hits
  • [ ] successful responses
  • [ ] unsuccessful responses

davecheney avatar Sep 23 '19 00:09 davecheney

@davecheney mind if I grab this one?

IngCr3at1on avatar Sep 27 '19 16:09 IngCr3at1on

@IngCr3at1on go for it. Please discuss your design here or in a design document before coding. Thank you.

davecheney avatar Sep 29 '19 22:09 davecheney

@davecheney yep. I realized right after asking that I was leaving on vacation, but I looked at it a bit and it doesn't look too bad. I'll take another look and get something written up here.

IngCr3at1on avatar Oct 07 '19 15:10 IngCr3at1on

This requires changing the following signature in internal/metrics/metrics.go:

func registerHealthCheck(mux *http.ServeMux, client *kubernetes.Clientset) {

to

func registerHealthCheck(mux *http.ServeMux, client *kubernetes.Clientset, registry *prometheus.Registry) {

and change

registerHealthCheck(&svc.ServeMux, svc.Client)
registerMetrics(&svc.ServeMux, svc.Registry)

to

registerHealthCheck(&svc.ServeMux, svc.Client, svc.Registry)
registerMetrics(&svc.ServeMux, svc.Registry)

Before creating the handler function in registerHealthCheck, register the three required gauges for health (hits, successful responses, unsuccessful responses) and adjust them directly from the handler function.

This needs a separate table-driven test in internal/metrics/metrics_test.go using httptest.

IngCr3at1on avatar Oct 08 '19 23:10 IngCr3at1on

@IngCr3at1on sounds fine to me.

davecheney avatar Oct 09 '19 01:10 davecheney