Adding Prometheus metrics to Zally server
Hi there, for our own dev stack we want to scrape metrics into our Grafana dashboards to show historical information on violations for each individual API.
The idea is that we poll the ApiReview table for available reviews and, based on the name of the API schema, expose any violations present on the /prometheus endpoint so they can be scraped.
I have already written the code needed to expose the Prometheus endpoint, as well as some code to expose violation metrics.
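To make the idea concrete, here is a minimal, language-neutral sketch in Python; it is not the code from this change, which uses Micrometer inside the Zally server. It assumes each row of the ApiReview table can be reduced to an API name, a creation timestamp and per-severity counts. The `Review` class, the metric name `zally_api_violations` and the label names `api_name` and `severity` are all hypothetical. The output follows the Prometheus text exposition format.

```python
from dataclasses import dataclass


@dataclass
class Review:
    # Hypothetical stand-in for one row of the ApiReview table.
    api_name: str
    created: int  # any sortable timestamp, e.g. epoch seconds
    must: int
    should: int
    may: int


def latest_per_api(reviews):
    """Keep only the most recent review for each API name."""
    latest = {}
    for review in reviews:
        current = latest.get(review.api_name)
        if current is None or review.created > current.created:
            latest[review.api_name] = review
    return latest


def render(reviews, metric="zally_api_violations"):
    """Render one gauge sample per (API name, severity) pair.

    Assumes API names contain no characters that need escaping
    (backslash, double quote, newline).
    """
    lines = [
        f"# HELP {metric} Violations in the latest review of each API.",
        f"# TYPE {metric} gauge",
    ]
    for name, review in sorted(latest_per_api(reviews).items()):
        for severity in ("must", "should", "may"):
            count = getattr(review, severity)
            lines.append(
                f'{metric}{{api_name="{name}",severity="{severity}"}} {count}'
            )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render([
        Review("orders", 1, must=5, should=1, may=0),
        Review("orders", 2, must=2, should=0, may=1),
        Review("users", 1, must=0, should=3, may=0),
    ]))
```

Only the newest review per API is exported here, so each scrape reflects the current state; the history then comes from Prometheus storing successive scrapes rather than from the endpoint itself.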
It looks like this is similar to /review-statistics. Could you just use that?
Similar, but different in that /review-statistics returns an aggregated view of all API reviews.
I could modify the existing endpoint to be more detailed (for instance, to show stats for API reviews grouped by API name).
However, exposing the Prometheus endpoint using Micrometer is also important, since we want to be able to scrape metrics from it and build dashboards in visualization tools like Grafana.
OK, I guess that makes sense.
The endpoint also needs to be added to OAuthConfiguration; otherwise it can't be accessed.
@vadeg can you also check this issue?
@WilliamTwill Thanks for your great contribution, and sorry for the poor support over the last month. I like your request and think we should merge it. Before we do, however, I would like to understand your label approach. Can you please elaborate on how reviews are supposed to be labeled and what exactly you want to achieve with that?
@tkrop So the change is two-fold.
- Exposure of Prometheus metrics. In general this means Zally can be monitored by Prometheus (and visualized with, for example, Grafana). Using these metrics we can, for example, auto-scale the Zally server in a Kubernetes cluster when many teams are validating their API changes at the same time.
- Custom labels. As of right now, my change allows passing a key-value pair when requesting a validation. These labels are also exposed in the Prometheus metrics. In our specific case we pass the git branch (feature/x, master, etc.) as a key-value pair ({"branch": "feature/x"}) so that we can distinguish validations in the database (and thus in the exposed Prometheus metrics).
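As a rough illustration of the custom-label part, the Python sketch below shows how a key-value pair such as {"branch": "feature/x"} could end up as an extra label on an exported sample. It is not the code from this change: the function names, the metric name `zally_api_violations` and the built-in labels `api_name` and `severity` are hypothetical. The validation and escaping follow the Prometheus rules for label names and label values.

```python
import re

# Prometheus label names must match this pattern.
_LABEL_NAME = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def escape_value(value):
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def sample_line(metric, base_labels, custom_labels, value):
    """Merge custom labels into the built-in ones and render one sample."""
    labels = dict(base_labels)
    for key, val in custom_labels.items():
        if not _LABEL_NAME.fullmatch(key):
            raise ValueError(f"invalid label name: {key!r}")
        if key in labels:
            raise ValueError(f"custom label {key!r} collides with a built-in label")
        labels[key] = val
    rendered = ",".join(
        f'{name}="{escape_value(val)}"' for name, val in sorted(labels.items())
    )
    return f"{metric}{{{rendered}}} {value}"


if __name__ == "__main__":
    print(sample_line(
        "zally_api_violations",
        {"api_name": "orders", "severity": "must"},
        {"branch": "feature/x"},
        2,
    ))
```

One consequence to keep in mind: every distinct label value produces a separate time series in Prometheus, so labels like branch names are best kept to a bounded set.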
On top of these metrics we have built a Grafana dashboard for our dev teams that shows how many (must, should, may) violations their services have. Of course, we also inject any violations coming back from Zally directly into merge requests as comments (as per #648 (comment)), so our devs can spot violations immediately during the dev cycle. The reason we also publish the API violation results to a dashboard is to demonstrate to people outside the teams that we work to high-quality standards.