incubator-pegasus
Feature(new_metrics): go-collector supports to collect and expose the metrics to Prometheus
1. Prometheus metrics of old framework
In the old framework, all metrics carry an identical set of labels. The only difference is that, for metrics that have no `app` and `partition` attributes, the values of those two labels are left empty, as in the following example:
zion_profiler_RPC_RRDB_RRDB_GET_qps{cluster="pegasus_cluster",host_name="abc.xyz",pegasus_job="replica",port="34601",service="pegasus",app="",partition=""} 0.000000
For metrics that do have `app` and `partition` attributes, the values of those two labels are non-empty:
replica_app_pegasus_get_qps{cluster="pegasus_cluster",host_name="abc.xyz",pegasus_job="replica",port="34801",service="pegasus",app="4",partition="2"} 0.000000
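The rule illustrated by the two samples above (a fixed label set, with `app` and `partition` left empty when a metric has no such attributes) can be sketched as follows. This is only an illustration, not the collector's actual code: `oldLabelNames` and `formatOldSample` are hypothetical names, and label values are assumed to be plain strings that need no escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// oldLabelNames is the fixed label set of the old framework, in the order
// used by the samples above. The variable name is hypothetical.
var oldLabelNames = []string{
	"cluster", "host_name", "pegasus_job", "port", "service", "app", "partition",
}

// formatOldSample renders one sample in the Prometheus text format.
// Every label in oldLabelNames is always emitted; a label missing from
// attrs (typically app and partition) is written with an empty value.
// Assumption: values contain no characters that need special escaping.
func formatOldSample(name string, attrs map[string]string, value float64) string {
	var b strings.Builder
	b.WriteString(name)
	b.WriteByte('{')
	for i, k := range oldLabelNames {
		if i > 0 {
			b.WriteByte(',')
		}
		// Looking up a missing key yields "", which gives app="" and partition="".
		fmt.Fprintf(&b, "%s=%q", k, attrs[k])
	}
	fmt.Fprintf(&b, "} %f", value)
	return b.String()
}

func main() {
	// A metric without app/partition attributes.
	server := map[string]string{
		"cluster": "pegasus_cluster", "host_name": "abc.xyz",
		"pegasus_job": "replica", "port": "34601", "service": "pegasus",
	}
	fmt.Println(formatOldSample("zion_profiler_RPC_RRDB_RRDB_GET_qps", server, 0))

	// A metric with app/partition attributes.
	replica := map[string]string{
		"cluster": "pegasus_cluster", "host_name": "abc.xyz",
		"pegasus_job": "replica", "port": "34801", "service": "pegasus",
		"app": "4", "partition": "2",
	}
	fmt.Println(formatOldSample("replica_app_pegasus_get_qps", replica, 0))
}
```

Under these assumptions, the two printed lines have the same shape as the samples shown above.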
2. The Design of Prometheus metrics of new framework
Based on the old framework and the features of the new metrics system, the new Prometheus labels could be designed as follows:
Label name | Description | Metric types | Entities |
---|---|---|---|
cluster | cluster name | - | - |
role | role name, i.e. meta/replica | - | - |
host | hostname of the server | - | - |
port | the port of the role instance | - | - |
entity | the entity type, such as server/table/replica ... | - | - |
table | table id | - | for table/partition/replica entity |
partition | partition id | - | for partition/replica entity |
p | the percentile, for example the value may be "90", "95", "999" | only for Percentile | - |
task | the task name | - | only for profiler entity |
dir | the disk directory | - | only for disk entity |
queue | the queue name | - | only for queue entity |
policy | the policy name | - | only for backup_policy entity |
tracer | the latency_tracer description | - | only for latency_tracer entity |
start | the starting point of latency_tracer | - | only for latency_tracer entity |
end | the end point of latency_tracer | - | only for latency_tracer entity |
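
One way to read the table: every sample carries the five common labels (`cluster`, `role`, `host`, `port`, `entity`), and entity-specific or type-specific labels such as `table`, `partition`, or `p` are added only where they apply, instead of being emitted with empty values. The sketch below illustrates that reading using only the Go standard library. The names `Labels`, `newLabels`, `with`, `formatSample`, and the metric name `example_latency_ns` are hypothetical and do not come from go-collector.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Labels holds the Prometheus label name/value pairs of one sample.
type Labels map[string]string

// newLabels returns the labels shared by every metric in the proposed design.
func newLabels(cluster, role, host, port, entity string) Labels {
	return Labels{
		"cluster": cluster,
		"role":    role,
		"host":    host,
		"port":    port,
		"entity":  entity,
	}
}

// with returns a copy of l extended with extra key/value pairs, for example
// table and partition for a replica entity, or p for a Percentile metric.
// The receiver is left unchanged.
func (l Labels) with(kv ...string) Labels {
	out := make(Labels, len(l)+len(kv)/2)
	for k, v := range l {
		out[k] = v
	}
	for i := 0; i+1 < len(kv); i += 2 {
		out[kv[i]] = kv[i+1]
	}
	return out
}

// escaper escapes backslash, double quote, and newline in label values,
// as required by the Prometheus text exposition format.
var escaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`)

// formatSample renders one sample in the Prometheus text format.
// Labels are sorted by name so that the output is deterministic.
func formatSample(name string, labels Labels, value float64) string {
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	pairs := make([]string, 0, len(keys))
	for _, k := range keys {
		pairs = append(pairs, k+`="`+escaper.Replace(labels[k])+`"`)
	}
	return fmt.Sprintf("%s{%s} %g", name, strings.Join(pairs, ","), value)
}

func main() {
	// Common labels for a replica entity on one replica server.
	base := newLabels("pegasus_cluster", "replica", "abc.xyz", "34801", "replica")
	// A Percentile metric of a replica entity adds table, partition, and p.
	sample := base.with("table", "4", "partition", "2", "p", "99")
	fmt.Println(formatSample("example_latency_ns", sample, 1234))
}
```

Compared with the old framework, a server-level metric in this sketch simply omits `table` and `partition` rather than exporting them as empty strings.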