prom-client icon indicating copy to clipboard operation
prom-client copied to clipboard

Aggregation of t-digest types broken by default

Open vincentwoo opened this issue 6 years ago • 12 comments

In practice, summary data in prometheus can be hard to aggregate, because they lose a bit of data. I've been using a custom IPC setup to push summary data generated on workers to a master's summary metric to avoid integration. However, this is cumbersome and I'd like to use the default metrics aggregator provided by prom-client. Can you speak to how this works in practice? It's a bit difficult for me to figure out.

vincentwoo avatar Oct 14 '18 10:10 vincentwoo

I'm not sure what concretely you're asking? Are you asking how the aggregation works in this library before the scrape by prometheus?

I'm also not sure what "worker" and "master" is here. Child process? Cluster? The new Worker in node 10?

SimenB avatar Oct 14 '18 14:10 SimenB

Yes, sorry for not writing that as well as I could have. I am asking about node's cluster module and specifically about how the summary instrument is aggregated in response to a Prometheus scrape. It seems like the cluster master sends an IPC request to each of the workers, but I'm not sure how the math works out. Other instrument types have straightforward ways to be aggregated, so I'm just curious if anything special needs to be done for summary.

I was motivated because after I switched to using the default aggregator registry for my summary metric, I noticed my metrics looked very different (several times higher in most quantiles.

vincentwoo avatar Oct 14 '18 22:10 vincentwoo

@zbjornson ^

SimenB avatar Oct 15 '18 06:10 SimenB

I think histograms and summaries behave the same as counters (which sum by default) within each individual sum/bucket. Tomorrow I can test to be sure.

zbjornson avatar Oct 15 '18 09:10 zbjornson

Ah, I think you are correct. Here's a test script:

const cluster = require('cluster')
const Prometheus = require('prom-client')

if (cluster.isMaster) {
  for (let i = 0; i < 3; i++) {
    cluster.fork()
  }
  const metricsServer = require('express')()
  const aggregatorRegistry = new Prometheus.AggregatorRegistry()
  metricsServer.get('/metrics', function(req, res) {
    aggregatorRegistry.clusterMetrics((err, metrics) => {
      res.set('Content-Type', aggregatorRegistry.contentType)
      res.end(metrics)
    })
  })
  metricsServer.listen(8000)
} else {
  const summary = new Prometheus.Summary({
    name: 'test',
    help: 'test'
  })
  summary.observe(1)
}

And the output:

$ curl http://localhost:8000/metrics
# HELP test test
# TYPE test summary
test{quantile="0.01"} 3
test{quantile="0.05"} 3
test{quantile="0.5"} 3
test{quantile="0.9"} 3
test{quantile="0.95"} 3
test{quantile="0.99"} 3
test{quantile="0.999"} 3
test_sum 3
test_count 3

I think this is probably not the behavior people would expect with default aggregation. I believe that summaries and histograms used the t-digest primitive, which appears to be built to be combined. If so, it seems like we could send the t-digest data over when summaries need to be aggregated.

Alternatively, might it be better for the library to do something under-the-covers where the master worker holds the "real" in-memory instrument, and the worker nodes, when observing measurements, merely pass those measurements over IPC to the master?

vincentwoo avatar Oct 15 '18 11:10 vincentwoo

Unfortunately I know very little about how histogram and summary metrics work. Both of the options you suggested in your last two paragraphs sound reasonable though. @brian-brazil, @SimenB or @siimon might know what the best approach. Definitely sounds like something to fix.

zbjornson avatar Oct 17 '18 05:10 zbjornson

Those quantiles aren't right, they should all be 1 given that input.

brian-brazil avatar Oct 17 '18 08:10 brian-brazil

We're wondering what the best way to correct that is: send T-digest data for combining on the master, or sending the individual, pre-sum data for summing/combining on the master?

I think summing is the correct thing to do for histograms (since it's reporting absolute counts in each bin), so just summaries need to be corrected.

zbjornson avatar Oct 17 '18 18:10 zbjornson

I think one problem is that the t-digest may be kinda large, so serializing/deserializing on every prometheus request is a bit rough. I think having prom-client perform under-the-cover IPC for observations for all metrics is probably better.

vincentwoo avatar Oct 18 '18 01:10 vincentwoo

Any consensus on approach?

vincentwoo avatar Oct 30 '18 22:10 vincentwoo

There is also a leaking behavior in the solution with t-digest like I described in #142 .

atd-schubert avatar Dec 28 '18 09:12 atd-schubert

I renamed the title to be a bit more on the mark and am bumping the thread! Any news?

vincentwoo avatar Sep 27 '20 06:09 vincentwoo