[feature] Introduce metrics aggregation by labelNames
Currently, the Prometheus client has no metrics aggregation; the exposed metrics are the raw data. For example:
Counter c = Counter.build("metrics_name", "help").labelNames("cluster", "namespace", "topic").create();
c.labels("a1", "b1", "c1").inc();
c.labels("a1", "b1", "c2").inc();
c.labels("a1", "b2", "c3").inc();
c.labels("a1", "b2", "c4").inc();
The exposed metrics are:
metrics_name_total{cluster="a1", namespace="b1", topic="c1"} 1
metrics_name_total{cluster="a1", namespace="b1", topic="c2"} 1
metrics_name_total{cluster="a1", namespace="b2", topic="c3"} 1
metrics_name_total{cluster="a1", namespace="b2", topic="c4"} 1
But in some cases we want to expose the metrics at a custom level. For example, at the cluster level:
metrics_name_total{cluster="a1"} 4
or at the [cluster, namespace] level:
metrics_name_total{cluster="a1", namespace="b1"} 2
metrics_name_total{cluster="a1", namespace="b2"} 2
If this request is feasible, it would greatly reduce the load on the Prometheus server. It would also benefit the client side, because it reduces the size of the response body.
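To make the requested behaviour concrete, here is a minimal plain-Java sketch of summing samples after dropping the labels that are not kept. `LabelSumSketch`, `sumBy`, and `exampleSamples` are hypothetical names used only for illustration; they are not part of the Prometheus Java client, and the samples are modelled as a simple map rather than real client types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Prometheus Java client.
class LabelSumSketch {

    /**
     * Sums sample values after dropping every label that is not in keepLabels.
     * labelNames: the metric's label names, in declaration order.
     * samples: label values (same order as labelNames) mapped to the sample value.
     */
    static Map<List<String>, Double> sumBy(List<String> labelNames,
                                           Map<List<String>, Double> samples,
                                           List<String> keepLabels) {
        // Resolve the position of each kept label once.
        int[] positions = new int[keepLabels.size()];
        for (int i = 0; i < positions.length; i++) {
            positions[i] = labelNames.indexOf(keepLabels.get(i));
            if (positions[i] < 0) {
                throw new IllegalArgumentException("Unknown label: " + keepLabels.get(i));
            }
        }
        Map<List<String>, Double> result = new LinkedHashMap<>();
        for (Map.Entry<List<String>, Double> entry : samples.entrySet()) {
            // Build the reduced label-value key, then add the value to its group.
            List<String> key = new ArrayList<>();
            for (int position : positions) {
                key.add(entry.getKey().get(position));
            }
            result.merge(key, entry.getValue(), Double::sum);
        }
        return result;
    }

    // The four counter samples from the example above.
    static Map<List<String>, Double> exampleSamples() {
        Map<List<String>, Double> samples = new LinkedHashMap<>();
        samples.put(Arrays.asList("a1", "b1", "c1"), 1.0);
        samples.put(Arrays.asList("a1", "b1", "c2"), 1.0);
        samples.put(Arrays.asList("a1", "b2", "c3"), 1.0);
        samples.put(Arrays.asList("a1", "b2", "c4"), 1.0);
        return samples;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("cluster", "namespace", "topic");
        // prints {[a1]=4.0}
        System.out.println(sumBy(names, exampleSamples(), Arrays.asList("cluster")));
        // prints {[a1, b1]=2.0, [a1, b2]=2.0}
        System.out.println(sumBy(names, exampleSamples(), Arrays.asList("cluster", "namespace")));
    }
}
```

The two printed maps correspond to the cluster-level and [cluster, namespace]-level outputs listed above.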
Implementation
Aggregator
We can introduce two aggregators, SUM and AVG.
For COUNTER, GAUGE, and HISTOGRAM we can apply the SUM aggregator; for SUMMARY we can apply both the AVG and SUM aggregators.
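As a rough sketch of what the two aggregators could look like, the enum below implements SUM and AVG over a list of values and records which aggregators apply to which metric type. `AggregatorSketch` and `allowedFor` are hypothetical names; the Prometheus Java client has no such type, and the metric types are passed as plain strings here only to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

// Hypothetical type for illustration; the Prometheus Java client has no such enum.
enum AggregatorSketch {
    SUM {
        @Override
        double apply(List<Double> values) {
            double total = 0;
            for (double value : values) {
                total += value;
            }
            return total;
        }
    },
    AVG {
        @Override
        double apply(List<Double> values) {
            // An empty group has no meaningful average.
            if (values.isEmpty()) {
                return Double.NaN;
            }
            return SUM.apply(values) / values.size();
        }
    };

    abstract double apply(List<Double> values);

    // Which aggregators the proposal allows for each metric type.
    static EnumSet<AggregatorSketch> allowedFor(String metricType) {
        switch (metricType) {
            case "COUNTER":
            case "GAUGE":
            case "HISTOGRAM":
                return EnumSet.of(SUM);
            case "SUMMARY":
                return EnumSet.of(SUM, AVG);
            default:
                throw new IllegalArgumentException("Unknown metric type: " + metricType);
        }
    }

    public static void main(String[] args) {
        System.out.println(SUM.apply(Arrays.asList(1.0, 1.0, 1.0, 1.0))); // prints 4.0
        System.out.println(AVG.apply(Arrays.asList(1.0, 3.0)));           // prints 2.0
        System.out.println(allowedFor("SUMMARY"));                        // prints [SUM, AVG]
    }
}
```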
Gauge:
Gauge g = Gauge.build("metrics_name", "help").labelNames("cluster", "namespace", "topic").create();
g.labels("a1", "b1", "c1").inc();
g.labels("a1", "b1", "c2").inc();
g.labels("a1", "b2", "c3").inc();
g.labels("a1", "b2", "c4").inc();
The raw data:
metrics_name{cluster="a1", namespace="b1", topic="c1"} 1
metrics_name{cluster="a1", namespace="b1", topic="c2"} 1
metrics_name{cluster="a1", namespace="b2", topic="c3"} 1
metrics_name{cluster="a1", namespace="b2", topic="c4"} 1
Aggregated at the cluster level:
metrics_name{cluster="a1"} 4
Aggregated at the [cluster, namespace] level:
metrics_name{cluster="a1", namespace="b1"} 2
metrics_name{cluster="a1", namespace="b2"} 2
Counter:
Counter c = Counter.build("metrics_name", "help").labelNames("cluster", "namespace", "topic").create();
c.labels("a1", "b1", "c1").inc();
c.labels("a1", "b1", "c2").inc();
c.labels("a1", "b2", "c3").inc();
c.labels("a1", "b2", "c4").inc();
The raw data:
metrics_name_total{cluster="a1", namespace="b1", topic="c1"} 1
metrics_name_total{cluster="a1", namespace="b1", topic="c2"} 1
metrics_name_total{cluster="a1", namespace="b2", topic="c3"} 1
metrics_name_total{cluster="a1", namespace="b2", topic="c4"} 1
Aggregated at the cluster level:
metrics_name_total{cluster="a1"} 4
Aggregated at the [cluster, namespace] level:
metrics_name_total{cluster="a1", namespace="b1"} 2
metrics_name_total{cluster="a1", namespace="b2"} 2
Histogram:
Histogram h = Histogram.build("metrics_name", "help").labelNames("cluster", "namespace", "topic").buckets(100, 200, 500).create();
h.labels("a1", "b1", "c1").observe(50);
h.labels("a1", "b1", "c1").observe(150);
h.labels("a1", "b1", "c1").observe(400);
h.labels("a1", "b1", "c2").observe(50);
h.labels("a1", "b1", "c2").observe(150);
h.labels("a1", "b1", "c2").observe(400);
h.labels("a1", "b2", "c3").observe(50);
h.labels("a1", "b2", "c3").observe(150);
h.labels("a1", "b2", "c3").observe(400);
h.labels("a1", "b2", "c4").observe(50);
h.labels("a1", "b2", "c4").observe(150);
h.labels("a1", "b2", "c4").observe(400);
The raw data:
metrics_name_bucket{cluster="a1",namespace="b1",topic="c1",le="100.0",} 1.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c1",le="200.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c1",le="500.0",} 3.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c1",le="+Inf",} 3.0
metrics_name_count{cluster="a1",namespace="b1",topic="c1",} 3.0
metrics_name_sum{cluster="a1",namespace="b1",topic="c1",} 600.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c2",le="100.0",} 1.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c2",le="200.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c2",le="500.0",} 3.0
metrics_name_bucket{cluster="a1",namespace="b1",topic="c2",le="+Inf",} 3.0
metrics_name_count{cluster="a1",namespace="b1",topic="c2",} 3.0
metrics_name_sum{cluster="a1",namespace="b1",topic="c2",} 600.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c3",le="100.0",} 1.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c3",le="200.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c3",le="500.0",} 3.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c3",le="+Inf",} 3.0
metrics_name_count{cluster="a1",namespace="b2",topic="c3",} 3.0
metrics_name_sum{cluster="a1",namespace="b2",topic="c3",} 600.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c4",le="100.0",} 1.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c4",le="200.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c4",le="500.0",} 3.0
metrics_name_bucket{cluster="a1",namespace="b2",topic="c4",le="+Inf",} 3.0
metrics_name_count{cluster="a1",namespace="b2",topic="c4",} 3.0
metrics_name_sum{cluster="a1",namespace="b2",topic="c4",} 600.0
Aggregated at the cluster level:
metrics_name_bucket{cluster="a1",le="100.0",} 4.0
metrics_name_bucket{cluster="a1",le="200.0",} 8.0
metrics_name_bucket{cluster="a1",le="500.0",} 12.0
metrics_name_bucket{cluster="a1",le="+Inf",} 12.0
metrics_name_count{cluster="a1",} 12.0
metrics_name_sum{cluster="a1",} 2400.0
Aggregated at the [cluster, namespace] level:
metrics_name_bucket{cluster="a1",namespace="b2",le="100.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b2",le="200.0",} 4.0
metrics_name_bucket{cluster="a1",namespace="b2",le="500.0",} 6.0
metrics_name_bucket{cluster="a1",namespace="b2",le="+Inf",} 6.0
metrics_name_count{cluster="a1",namespace="b2",} 6.0
metrics_name_sum{cluster="a1",namespace="b2",} 1200.0
metrics_name_bucket{cluster="a1",namespace="b1",le="100.0",} 2.0
metrics_name_bucket{cluster="a1",namespace="b1",le="200.0",} 4.0
metrics_name_bucket{cluster="a1",namespace="b1",le="500.0",} 6.0
metrics_name_bucket{cluster="a1",namespace="b1",le="+Inf",} 6.0
metrics_name_count{cluster="a1",namespace="b1",} 6.0
metrics_name_sum{cluster="a1",namespace="b1",} 1200.0
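The aggregated histogram values above follow from adding the cumulative bucket counts position by position and summing `_sum`. The sketch below shows this in plain Java. `HistogramMergeSketch` and `Series` are hypothetical names that model one histogram series as an array of cumulative bucket counts; it assumes all series share the same bucket boundaries, which holds here because they come from the same Histogram.

```java
import java.util.Arrays;

// Hypothetical model for illustration; not part of the Prometheus Java client.
class HistogramMergeSketch {

    // One histogram series: cumulative bucket counts (last entry is le="+Inf") and the _sum value.
    static final class Series {
        final double[] cumulativeBuckets;
        final double sum;

        Series(double[] cumulativeBuckets, double sum) {
            this.cumulativeBuckets = cumulativeBuckets;
            this.sum = sum;
        }

        // _count equals the +Inf bucket.
        double count() {
            return cumulativeBuckets[cumulativeBuckets.length - 1];
        }
    }

    // SUM aggregation: add bucket counts position by position and add up the sums.
    static Series merge(Series... series) {
        if (series.length == 0) {
            throw new IllegalArgumentException("Nothing to merge");
        }
        double[] buckets = new double[series[0].cumulativeBuckets.length];
        double sum = 0;
        for (Series s : series) {
            if (s.cumulativeBuckets.length != buckets.length) {
                throw new IllegalArgumentException("Bucket layouts differ");
            }
            for (int i = 0; i < buckets.length; i++) {
                buckets[i] += s.cumulativeBuckets[i];
            }
            sum += s.sum;
        }
        return new Series(buckets, sum);
    }

    public static void main(String[] args) {
        // Each topic observed 50, 150 and 400 with buckets 100, 200, 500, +Inf.
        Series perTopic = new Series(new double[] {1, 2, 3, 3}, 600);
        Series cluster = merge(perTopic, perTopic, perTopic, perTopic);
        System.out.println(Arrays.toString(cluster.cumulativeBuckets)); // prints [4.0, 8.0, 12.0, 12.0]
        System.out.println(cluster.count() + " " + cluster.sum);        // prints 12.0 2400.0
    }
}
```

Merging only two of the series gives the [cluster, namespace]-level values (2, 4, 6, 6 with a sum of 1200).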
Summary:
Unlike the metric types above, SUMMARY needs special handling.
For metrics_name_count and metrics_name_sum, we have to use the SUM aggregator.
But for the time series with a quantile label, I think the AVG aggregator is the best choice.
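A minimal sketch of this split, assuming every series being merged reports the same set of quantiles: `_count` and `_sum` are added, while each quantile value is averaged across the series. `SummaryMergeSketch` and `Series` are hypothetical names and the numbers are made up for illustration. Note that an averaged quantile is only an approximation; in general it is not the quantile of the combined observations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model for illustration; not part of the Prometheus Java client.
class SummaryMergeSketch {

    // One summary series: quantile -> value, plus the _count and _sum values.
    static final class Series {
        final Map<Double, Double> quantiles;
        final double count;
        final double sum;

        Series(Map<Double, Double> quantiles, double count, double sum) {
            this.quantiles = quantiles;
            this.count = count;
            this.sum = sum;
        }
    }

    // SUM for _count and _sum, AVG for every quantile value.
    static Series merge(List<Series> series) {
        if (series.isEmpty()) {
            throw new IllegalArgumentException("Nothing to merge");
        }
        Map<Double, Double> quantileTotals = new TreeMap<>();
        double count = 0;
        double sum = 0;
        for (Series s : series) {
            for (Map.Entry<Double, Double> q : s.quantiles.entrySet()) {
                quantileTotals.merge(q.getKey(), q.getValue(), Double::sum);
            }
            count += s.count;
            sum += s.sum;
        }
        // Assumes each quantile appears in every series, so dividing by the series count is an average.
        Map<Double, Double> averaged = new TreeMap<>();
        for (Map.Entry<Double, Double> q : quantileTotals.entrySet()) {
            averaged.put(q.getKey(), q.getValue() / series.size());
        }
        return new Series(averaged, count, sum);
    }

    static Series series(double p50, double p99, double count, double sum) {
        Map<Double, Double> quantiles = new TreeMap<>();
        quantiles.put(0.5, p50);
        quantiles.put(0.99, p99);
        return new Series(quantiles, count, sum);
    }

    public static void main(String[] args) {
        // Made-up values for two topics in the same namespace.
        Series merged = merge(Arrays.asList(series(100, 400, 3, 600), series(200, 600, 5, 1500)));
        System.out.println(merged.quantiles);                // prints {0.5=150.0, 0.99=500.0}
        System.out.println(merged.count + " " + merged.sum); // prints 8.0 2100.0
    }
}
```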
- [x] I'm willing to submit the PR
@tjiuming For the group-by-label-name Counter scenario, you can write your own Collector and register it. The same approach should also work for other types of aggregation/summation.
Example JUnit test / CounterGroupByCollector
package io.prometheus.client;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import java.util.ArrayList;
import java.util.List;
public class CounterGroupByCollectorTest {

    CollectorRegistry registry;
    Counter counter;
    CounterGroupByCollector counterGroupByCollector;

    @Before
    public void setUp() {
        registry = new CollectorRegistry();
        // create() builds the counter without registering it with any registry.
        counter = Counter.build("metrics_name", "metrics_name help").labelNames("cluster", "namespace", "topic").create();
        counterGroupByCollector = new CounterGroupByCollector(counter);
    }

    @Test
    public void test() {
        counter.labels("a1", "b1", "c1").inc();
        counter.labels("a1", "b1", "c2").inc();
        counter.labels("a1", "b2", "c3").inc();
        counter.labels("a1", "b2", "c4").inc();

        System.out.println("Group by \"cluster\"...");
        counterGroupByCollector.groupBy("cluster");
        printSamples(counterGroupByCollector.collect());
        System.out.println("---");

        System.out.println("No group by...");
        counterGroupByCollector.groupBy((String[]) null);
        printSamples(counterGroupByCollector.collect());
        System.out.println("---");

        System.out.println("Group by \"cluster\", \"namespace\"...");
        counterGroupByCollector.groupBy("cluster", "namespace");
        printSamples(counterGroupByCollector.collect());
        System.out.println("---");

        System.out.println("Group by \"cluster\", \"namespace\", \"topic\"...");
        counterGroupByCollector.groupBy("cluster", "namespace", "topic");
        printSamples(counterGroupByCollector.collect());
    }

    // Prints every sample of every metric family, one per line.
    private static void printSamples(List<Collector.MetricFamilySamples> mfs) {
        for (Collector.MetricFamilySamples samples : mfs) {
            for (Collector.MetricFamilySamples.Sample sample : samples.samples) {
                System.out.println(String.format("sample [%s]", sample));
            }
        }
    }

    // Wraps a Counter and exposes its samples summed over a leading subset (prefix) of its label names.
    public static class CounterGroupByCollector extends Collector {

        private Counter counter;
        private String[] groupByLabelNames;

        public CounterGroupByCollector(Counter counter) {
            this.counter = counter;
        }

        public void groupBy(String ... labelNames) {
            // null or empty disables grouping; collect() then returns the raw samples.
            if ((labelNames == null) || (labelNames.length == 0)) {
                synchronized (this) {
                    groupByLabelNames = null;
                }
                return;
            }

            if (labelNames.length > counter.labelNames.size()) {
                throw new IllegalArgumentException("Group by label names contain more labels than the Counter");
            }

            // The labels are compared position by position, so they must be a prefix
            // of the Counter label names, in the same order.
            List<String> labelNameList = toList(labelNames);
            List<String> counterLabelNameList = counter.labelNames;
            for (int i = 0; i < labelNameList.size(); i++) {
                if (!labelNameList.get(i).equals(counterLabelNameList.get(i))) {
                    throw new IllegalArgumentException("Group by label names are not a prefix of the Counter label names");
                }
            }

            synchronized (this) {
                this.groupByLabelNames = labelNames;
            }
        }

        @Override
        public List<MetricFamilySamples> collect() {
            String[] localGroupByLabelNames;
            synchronized (this) {
                localGroupByLabelNames = groupByLabelNames;
            }

            if (localGroupByLabelNames == null) {
                return counter.collect();
            }

            // Temporary counter that holds the aggregated values.
            // Name and help are hard-coded to match the counter built in setUp().
            Counter localCounter =
                    Counter
                            .build("metrics_name", "metrics_name help")
                            .labelNames(localGroupByLabelNames).create();

            List<Collector.MetricFamilySamples> mfs = counter.collect();
            for (Collector.MetricFamilySamples samples : mfs) {
                for (Collector.MetricFamilySamples.Sample sample : samples.samples) {
                    // Only the _total samples carry the counter value; other samples are skipped.
                    if (sample.name.endsWith("_total")) {
                        // Keep the leading label values and add the sample to that group.
                        String[] labelValues = sample.labelValues.subList(0, localGroupByLabelNames.length).toArray(new String[localGroupByLabelNames.length]);
                        localCounter.labels(labelValues).inc(sample.value);
                    }
                }
            }

            return localCounter.collect();
        }
    }

    private static List<String> toList(String ... values) {
        List<String> list = new ArrayList<String>(values.length);
        for (String value : values) {
            list.add(value);
        }
        return list;
    }
}