incubator-pegasus

Feature(new_metrics): go-collector supports collecting and exposing metrics to Prometheus

Open empiredan opened this issue 1 year ago • 0 comments

1. Prometheus metrics of the old framework

In the old framework, all metrics share an identical set of labels. The only difference is that for metrics with no app or partition attribute, the values of the app and partition labels are left empty, as in the following example:

zion_profiler_RPC_RRDB_RRDB_GET_qps{cluster="pegasus_cluster",host_name="abc.xyz",pegasus_job="replica",port="34601",service="pegasus",app="",partition=""} 0.000000

For metrics that do have app and partition attributes, the values of these labels are non-empty:

replica_app_pegasus_get_qps{cluster="pegasus_cluster",host_name="abc.xyz",pegasus_job="replica",port="34801",service="pegasus",app="4",partition="2"} 0.000000
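The fixed label set of the old framework can be illustrated with a minimal Go sketch that uses only the standard library. The names `oldLabelNames` and `formatOldSample` are hypothetical and are not taken from go-collector; the label order simply follows the two samples above.

```go
package main

import (
	"fmt"
	"strings"
)

// oldLabelNames is the label set shared by every metric in the old
// framework, in the order shown in the samples above (hypothetical name).
var oldLabelNames = []string{
	"cluster", "host_name", "pegasus_job", "port", "service", "app", "partition",
}

// formatOldSample renders one sample line. Any label missing from values
// (for example app and partition on a server-level metric) is emitted as "".
// %q is adequate here because the values are plain ASCII.
func formatOldSample(name string, values map[string]string, v float64) string {
	pairs := make([]string, 0, len(oldLabelNames))
	for _, label := range oldLabelNames {
		pairs = append(pairs, fmt.Sprintf("%s=%q", label, values[label]))
	}
	return fmt.Sprintf("%s{%s} %f", name, strings.Join(pairs, ","), v)
}

func main() {
	// Server-level metric: no app or partition attribute, so both stay empty.
	server := map[string]string{
		"cluster": "pegasus_cluster", "host_name": "abc.xyz",
		"pegasus_job": "replica", "port": "34601", "service": "pegasus",
	}
	fmt.Println(formatOldSample("zion_profiler_RPC_RRDB_RRDB_GET_qps", server, 0))

	// Replica-level metric: app and partition are filled in.
	replica := map[string]string{
		"cluster": "pegasus_cluster", "host_name": "abc.xyz",
		"pegasus_job": "replica", "port": "34801", "service": "pegasus",
		"app": "4", "partition": "2",
	}
	fmt.Println(formatOldSample("replica_app_pegasus_get_qps", replica, 0))
}
```

Because every metric carries all seven labels, the empty app and partition values are what distinguish server-level metrics from replica-level ones in this scheme.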

2. Design of Prometheus metrics for the new framework

Based on the old framework and the features of the new metrics system, the new Prometheus labels could be designed as follows:

| Label name | Description | Metric types | Entities |
| --- | --- | --- | --- |
| cluster | cluster name | - | - |
| role | role name, i.e. meta/replica | - | - |
| host | hostname of the server | - | - |
| port | the port of the role instance | - | - |
| entity | such as server/table/replica ... | - | - |
| table | table id | - | for table/partition/replica entity |
| partition | partition id | - | for partition/replica entity |
| p | the percentile, for example the value may be "90", "95", "999" | only for Percentile | - |
| task | the task name | - | only for profiler entity |
| dir | the disk directory | - | only for disk entity |
| queue | the queue name | - | only for queue entity |
| policy | the policy name | - | only for backup_policy entity |
| tracer | the latency_tracer description | - | only for latency_tracer entity |
| start | the starting point of latency_tracer | - | only for latency_tracer entity |
| end | the end point of latency_tracer | - | only for latency_tracer entity |
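One way to read the proposed design is that each sample carries the common labels plus only the labels relevant to its entity type, rather than a fixed set padded with empty values. The following Go sketch encodes that reading with the standard library only. The names `commonLabels`, `entityLabels`, and `buildLabels` are hypothetical, the label ordering (with p last) is an assumption, and the entity list covers only the entity types named in the design.

```go
package main

import (
	"fmt"
	"strings"
)

// commonLabels apply to every metric in the proposed design.
var commonLabels = []string{"cluster", "role", "host", "port", "entity"}

// entityLabels lists the extra labels each entity type carries in the
// proposed design. The map is an illustrative encoding, not go-collector code.
var entityLabels = map[string][]string{
	"server":         {},
	"table":          {"table"},
	"partition":      {"table", "partition"},
	"replica":        {"table", "partition"},
	"profiler":       {"task"},
	"disk":           {"dir"},
	"queue":          {"queue"},
	"backup_policy":  {"policy"},
	"latency_tracer": {"tracer", "start", "end"},
}

// buildLabels renders the label set for one sample. attrs holds values keyed
// by label name; percentile is "" for metrics that are not of Percentile type.
func buildLabels(entity string, attrs map[string]string, percentile string) (string, error) {
	extra, ok := entityLabels[entity]
	if !ok {
		return "", fmt.Errorf("unknown entity type %q", entity)
	}
	names := append(append([]string{}, commonLabels...), extra...)
	pairs := make([]string, 0, len(names)+1)
	for _, n := range names {
		v := attrs[n]
		if n == "entity" {
			v = entity // the entity label is taken from the entity type itself
		}
		pairs = append(pairs, fmt.Sprintf("%s=%q", n, v))
	}
	if percentile != "" {
		// The p label is attached only to Percentile metrics.
		pairs = append(pairs, fmt.Sprintf("p=%q", percentile))
	}
	return "{" + strings.Join(pairs, ",") + "}", nil
}

func main() {
	attrs := map[string]string{
		"cluster": "pegasus_cluster", "role": "replica",
		"host": "abc.xyz", "port": "34801",
		"table": "4", "partition": "2",
	}
	labels, err := buildLabels("replica", attrs, "999")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(labels)
}
```

Compared with the old scheme, a server-level metric in this sketch has no table or partition label at all, so queries do not need to filter on empty label values.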

empiredan avatar Dec 29 '23 07:12 empiredan