fabric
Add performance metrics to the gateway service
Add Fabric Prometheus/StatsD metrics to the gateway service component.
Closed the related fabric-gateway issue: https://github.com/hyperledger/fabric-gateway/issues/346
Hello everyone, I am going to provide some references and my personal opinion based on the white paper.
If I am correct, from a business point of view it would be better for the gateway component to report on:
- read latency
- read throughput
- write latency
- write throughput
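To make the four numbers above concrete, here is a minimal sketch of how latency and throughput could be derived from per-operation durations. `OpStats` and its methods are hypothetical names for illustration only; this is not a Fabric or gateway API, and a real implementation would more likely use histograms from the metrics provider.

```python
from collections import defaultdict


class OpStats:
    """Collects durations per operation type (hypothetical helper)."""

    def __init__(self):
        # operation name ("read" / "write") -> list of durations in seconds
        self.durations = defaultdict(list)

    def observe(self, op, seconds):
        """Record how long one completed operation took."""
        self.durations[op].append(seconds)

    def mean_latency(self, op):
        """Average duration of the recorded operations, 0.0 if none."""
        samples = self.durations[op]
        return sum(samples) / len(samples) if samples else 0.0

    def throughput(self, op, window_seconds):
        """Completed operations per second over the observation window."""
        return len(self.durations[op]) / window_seconds


stats = OpStats()
for d in (0.010, 0.020, 0.030):   # three reads
    stats.observe("read", d)
for d in (1.0, 2.0):              # two writes
    stats.observe("write", d)

print(stats.mean_latency("read"), stats.throughput("write", 10))
```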
For reads, I am not sure whether the gateway queries different peers or only the local peer. If it uses the local peer only, there seems to be no need for read metrics on the gateway, since they would be nearly the same as metrics taken on the client/application side.
For writes, of course, the gateway has to wait for endorsement, ordering, validation, and so on. But we should take care of the "network threshold": the block being committed on a given number of peers rather than on a single peer. So far, the "network threshold" has to be calculated on the Prometheus side or the Grafana side.
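To illustrate what I mean by "network threshold", here is a rough sketch of the calculation, assuming we already have the commit timestamp of the block on each peer. The function name, the peer names, and the millisecond timestamps are all made up for the example; nothing here is an existing Fabric metric.

```python
def threshold_commit_latency(submit_ms, peer_commit_ms, threshold):
    """Latency until `threshold` peers have committed the block.

    submit_ms:      time the transaction was submitted (milliseconds)
    peer_commit_ms: dict of peer name -> commit time (milliseconds),
                    containing only peers that have committed so far
    threshold:      number of peers that must have committed

    Returns None while fewer than `threshold` peers have committed.
    """
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    commits = sorted(peer_commit_ms.values())
    if len(commits) < threshold:
        return None
    # The threshold is reached when the threshold-th fastest peer commits.
    return commits[threshold - 1] - submit_ms


commits = {"peer0": 1800, "peer1": 2500, "peer2": 1900, "peer3": 3000}
print(threshold_commit_latency(1000, commits, 1))  # single-peer view
print(threshold_commit_latency(1000, commits, 3))  # 3-of-4 peers view
```

The single-peer latency and the 3-of-4 latency differ, which is the gap that today would have to be computed on the Prometheus or Grafana side.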
For this part, @denyeart, please help confirm the points below. By the way, is there a high-level gateway design/workflow document I can use as a reference? So far https://hyperledger-fabric.readthedocs.io/en/latest/gateway.html only describes the endorsement phase.
- [ ] For reads, will the gateway compare query results from different peers before it sends the response back to the client?
- [ ] For writes, will the gateway wait for a number of peers to confirm the block before it sends the response back to the client?
Meanwhile, those metrics mainly concern all peers in the network, and we would use a metrics collector such as Prometheus to gather data from the different peers. Those metrics already exist, in some form, as part of the peer metrics.
Personally, I expect the gateway metrics to show whether the gateway is a bottleneck for the current peer node. As far as I know, the gateway runs together with the peer. In an edge case, traffic could be held at the gateway and never be ordered (for example, with a Raft issue where the block never arrives). Will the gateway then eat up the peer node's memory? Given that edge case, I am not sure whether we should add a system metric such as the size of the queue of transactions waiting to be processed on the gateway.
To summarize today's comments, I am focusing on the following for metrics:
- Report gateway system status, such as transactions in the queue, memory usage, etc.
- Show read/write performance as something distinct from the existing peer metrics, instead of calculating it on the Prometheus side or the Grafana side, if read/write operations take other peers into account.
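
For the "transactions in queue" idea, a gauge that goes up when the gateway starts handling a transaction and down when it finishes might be enough to spot the stuck-traffic edge case. A minimal sketch follows; `InFlightGauge` is a hypothetical name, and I am assuming the gateway has a clear start and end point for each request.

```python
import threading
from contextlib import contextmanager


class InFlightGauge:
    """Counts requests currently being handled (hypothetical helper)."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0   # requests in flight right now
        self.peak = 0    # highest value seen so far

    def inc(self):
        with self._lock:
            self.value += 1
            self.peak = max(self.peak, self.value)

    def dec(self):
        with self._lock:
            self.value -= 1

    @contextmanager
    def track(self):
        """Count one request as in flight for the duration of the block."""
        self.inc()
        try:
            yield
        finally:
            # Runs even if the request fails, so the gauge cannot leak.
            self.dec()


gauge = InFlightGauge()
with gauge.track():
    with gauge.track():
        print("in flight:", gauge.value)
print("after:", gauge.value, "peak:", gauge.peak)
```

If blocks never arrive from the orderer, requests waiting for commit would never reach the `dec` step, so the gauge would keep growing, which is the signal I would like to see.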