
Adding Prometheus metrics to Zally server

Open WilliamTwill opened this issue 4 years ago • 5 comments

Hi there, for our own dev stack we wanted to scrape metrics into our Grafana dashboards, showing historical information on violations for each individual API.

The idea is that we poll the ApiReview table for available reviews and, based on the name of the API schema, expose any present violations at the /prometheus endpoint so they can be scraped.

I have already written the code needed to expose the Prometheus endpoint, as well as some code to expose violation metrics.
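For context, in a Spring Boot service such as Zally server, a Prometheus scrape endpoint is typically exposed via Spring Boot Actuator once the Micrometer Prometheus registry is on the classpath. A minimal sketch of the standard configuration (this is the generic Spring Boot/Micrometer approach, not necessarily the exact code in this contribution; by default Spring Boot serves the endpoint under /actuator/prometheus):

```properties
# application.properties
# Requires io.micrometer:micrometer-registry-prometheus on the classpath.
# Exposes the Prometheus scrape endpoint via Spring Boot Actuator.
management.endpoints.web.exposure.include=prometheus
```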

WilliamTwill avatar Jul 01 '20 15:07 WilliamTwill

It looks like this is similar to /review-statistics. Could you just use that?

zeitlinger avatar Jul 06 '20 08:07 zeitlinger

Similar, but different in that /review-statistics returns an aggregated view of all API reviews. I could modify the existing endpoint to be more detailed (e.g. show stats for API reviews grouped by API name). However, exposing the Prometheus endpoint using Micrometer is also important, since we want to be able to scrape metrics from it and build dashboards in visualization tools like Grafana.

WilliamTwill avatar Jul 06 '20 08:07 WilliamTwill

OK, I guess that makes sense.

The endpoint also needs to be added to OAuthConfiguration; otherwise it can't be accessed.
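A hedged sketch of what such an exception could look like in a Spring Security configuration (the matcher and method names here are illustrative of the Spring Security 5.x DSL; Zally's actual OAuthConfiguration may be structured differently):

```java
// Illustrative only: whitelist the metrics endpoint so Prometheus
// can scrape it without presenting an OAuth token, while everything
// else remains authenticated.
@Override
protected void configure(HttpSecurity http) throws Exception {
    http.authorizeRequests()
        .antMatchers("/prometheus").permitAll() // unauthenticated scraping
        .anyRequest().authenticated();
}
```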

@vadeg can you also check this issue?

zeitlinger avatar Jul 06 '20 09:07 zeitlinger

@WilliamTwill Thanks for your great contribution, and sorry for the poor support over the last month. I like your request and think we should merge it. However, before we do, I would like to understand your label approach. Can you please elaborate on how reviews are supposed to be labeled and what exactly you want to achieve with that?

tkrop avatar Sep 11 '20 09:09 tkrop


@tkrop So the change is two-fold.

  1. Exposure of Prometheus metrics. In general this means Zally can be monitored by Prometheus (and visualized with, for example, Grafana). Using these metrics, we can for instance auto-scale the Zally server in a Kubernetes cluster based on usage metrics from Prometheus when many teams are validating their API changes at the same time.

  2. Custom labels. As of right now, my change allows passing a key-value pair when requesting a validation. These labels are also exposed within the Prometheus metrics. In our specific case we pass the Git branch (feature/x, master, etc.) as a key-value pair ({"branch": "feature/x"}) so that we can distinguish validations in the database (and thus in the exposed Prometheus metrics).
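Conceptually, a label turns one logical metric into a family of time series, one per distinct label combination; that is what makes the branch label above queryable in Grafana. A minimal, self-contained Java sketch of this semantics (the class and method names are hypothetical; the actual change would use Micrometer tags rather than this hand-rolled map):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of Prometheus-style labelled counters:
// one metric name fans out into one counter per distinct label set.
class LabelledCounter {
    private final Map<String, Long> series = new HashMap<>();

    // Canonical key: sorted k="v" pairs, mirroring a Prometheus label set.
    private static String keyOf(Map<String, String> labels) {
        return labels.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .reduce((a, b) -> a + "," + b)
                .orElse("");
    }

    void increment(Map<String, String> labels) {
        series.merge(keyOf(labels), 1L, Long::sum);
    }

    long value(Map<String, String> labels) {
        return series.getOrDefault(keyOf(labels), 0L);
    }
}
```

Two validations on feature/x and one on master would then produce two independent series under the same metric name, which is exactly the behavior the {"branch": "..."} label is meant to provide.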

On top of these metrics we have built a Grafana dashboard for our dev teams which shows how many (must, should, may) violations their services have. Of course, we also inject any violations coming back from Zally directly into merge requests as comments (as per #648 (comment)), so our devs can immediately spot violations during the dev cycle. The reason we also publish the API violation results to a dashboard is to demonstrate to people outside the teams that we work to high-quality standards.

WilliamTwill avatar Sep 11 '20 09:09 WilliamTwill