
bdns: reduce cardinality of histograms

Open jsha opened this issue 3 years ago • 0 comments

Using the query from https://www.robustperception.io/dropping-metrics-at-scrape-time-with-prometheus:

topk(20, count by (__name__, job)({__name__=~".+"}))

I found that the histograms in bdns are far and away our highest cardinality metrics.


That's because they are histograms (already high cardinality, since each label combination produces one series per bucket plus a sum and a count), they have many labels, and some of those labels have high-cardinality values (for instance, the resolver).

We should tweak these: for the histogram metrics, we are mostly interested in the time to do various operations, so we should remove almost all labels. We can then put those labels on counter metrics, which inherently have lower cardinality. This will reduce our storage and memory requirements for Prometheus, and will make querying these DNS histograms cheaper.
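To make the expected savings concrete, here is a rough back-of-the-envelope sketch in Go. The bucket count and the label names and cardinalities (qtype, result, resolver) are made-up assumptions for illustration, not the real bdns numbers. The counts are upper bounds, since Prometheus only stores series for label combinations that have actually been observed.

```go
package main

import "fmt"

// combos returns the number of distinct label-value combinations, which is
// the product of the per-label cardinalities. With no labels it is 1.
func combos(cardinalities ...int) int {
	n := 1
	for _, c := range cardinalities {
		n *= c
	}
	return n
}

// histogramSeries returns the worst-case number of time series for a
// histogram: per label combination, one series for each explicit bucket,
// one for the implicit +Inf bucket, plus the _sum and _count series.
func histogramSeries(explicitBuckets int, cardinalities ...int) int {
	return (explicitBuckets + 3) * combos(cardinalities...)
}

// counterSeries returns the worst-case number of time series for a counter:
// one series per label combination.
func counterSeries(cardinalities ...int) int {
	return combos(cardinalities...)
}

func main() {
	// Hypothetical values, chosen only to illustrate the arithmetic.
	const buckets = 12
	qtype, result, resolver := 5, 10, 20

	// Today: every label sits on the histogram.
	before := histogramSeries(buckets, qtype, result, resolver)

	// Proposed: the histogram keeps a single low-cardinality label, and the
	// remaining detail moves to a counter.
	after := histogramSeries(buckets, qtype) + counterSeries(qtype, result, resolver)

	fmt.Println("before:", before) // before: 15000
	fmt.Println("after:", after)   // after: 1075
}
```

Under these assumed numbers, the per-label breakdown is still available from the counter, while the histogram shrinks from 15000 series to 75.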

jsha, May 25 '22 22:05