Add additional PeerDAS metrics

Open jimmygchen opened this issue 1 year ago • 2 comments

Description

Add additional PeerDAS metrics to give us more visibility into the node.

Suggested list from Andrew:

just checked out lighthouse das branch specific metrics;
- Full runtime of data column sidecars gossip verification (counter)
- Number of data column sidecars verified for gossip (counter)
- Total count of reconstructed columns (counter)
- Time taken to reconstruct columns (histogram)
- Time taken to compute data column sidecar, including cells, proofs and inclusion proof (histogram)
- Time taken to verify data_column sidecar inclusion proof (histogram)
- Runtime of batched data column kzg verification (histogram)
- Runtime of single data column kzg verification (histogram)

Counts on custody could be useful. `column_index` as labels?
- Total count of columns in custody (counter)

Gossipsub domain metrics will be very handy imo. Should contain topic `data_column_sidecar_{subnet_id}` as labels;
- Number of gossip messages sent to each topic (counter)
- Number of bytes sent to each topic (counter)
- Number of gossip messages received from each topic (including duplicates) (counter)
- Number of bytes received from each topic (including duplicates) (counter)
- Number of gossip messages received from each topic (deduplicated) (counter)
- Number of bytes received from each topic (deduplicated) (counter)
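
As an illustration of the per-topic labelling suggested above, here is a minimal sketch in plain Python. It is not Lighthouse code (Lighthouse is written in Rust and has its own metrics plumbing); the `TopicCounters` class, its `record` method, and the payload sizes are hypothetical and only show how a message counter and a byte counter could share the `data_column_sidecar_{subnet_id}` label.

```python
from collections import Counter


class TopicCounters:
    """Hypothetical per-topic message and byte counters (illustration only)."""

    def __init__(self) -> None:
        self.messages = Counter()  # topic label -> number of messages
        self.bytes = Counter()     # topic label -> number of payload bytes

    def record(self, subnet_id: int, payload: bytes) -> None:
        # The label follows the topic naming suggested in the issue.
        topic = f"data_column_sidecar_{subnet_id}"
        self.messages[topic] += 1
        self.bytes[topic] += len(payload)


# Made-up traffic: two messages on subnet 3, one on subnet 7.
sent = TopicCounters()
sent.record(3, b"\x00" * 2048)
sent.record(3, b"\x00" * 1024)
sent.record(7, b"\x00" * 512)
```

The same shape would apply to the received (including duplicates) and received (deduplicated) counters, with one `TopicCounters` instance per direction.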

Req/Resp domain metrics would be nice to have. Should contain protocol ID `/eth2/beacon_chain/req/data_column_sidecars_by_root/1/` and `/eth2/beacon_chain/req/data_column_sidecars_by_range/1/` as labels;
- Number of requests sent (counter)
- Number of requests received (counter)
- Number of responses sent (counter)
- Number of response bytes sent (counter)
- Number of responses received (counter)
- Number of response bytes received (counter)

jimmygchen, Jun 28 '24 15:06

I think we already have most of those gossip metrics:

  • gossipsub_topic_msg_recv_bytes_total: number of bytes received through gossip
  • gossipsub_topic_msg_sent_bytes_total: number of bytes sent through gossip
  • gossipsub_topic_msg_recv_counts_unfiltered_total: number of gossip messages received, including duplicates
  • gossipsub_topic_msg_recv_counts_total: number of gossip messages received, after duplicates are filtered out
  • the number of duplicates can be calculated from the above two metrics
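
A rough sketch of that calculation, assuming the two counters have already been scraped into per-topic dictionaries. The function name and the sample values are hypothetical; only the subtraction itself comes from the comment above.

```python
def duplicate_counts(unfiltered: dict[str, int], filtered: dict[str, int]) -> dict[str, int]:
    """Duplicates per topic = unfiltered receive count - deduplicated receive count."""
    return {
        topic: total - filtered.get(topic, 0)
        for topic, total in unfiltered.items()
    }


# Made-up sample values for two data column topics.
unfiltered = {"data_column_sidecar_0": 120, "data_column_sidecar_1": 95}
filtered = {"data_column_sidecar_0": 40, "data_column_sidecar_1": 38}
duplicates = duplicate_counts(unfiltered, filtered)
```

In a dashboard the same subtraction would normally be applied to rates over a time window rather than to raw counter values.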

We're lacking metrics on bytes sent / received over RPC. I've raised an issue for this (https://github.com/sigp/lighthouse/issues/6114).

There's a metric for overall libp2p bandwidth, libp2p_bandwidth_bytes_total, which gives a rough idea of how many bytes are sent / received outside of gossip.
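
One way to read "rough idea" is to subtract the gossip byte counters from the overall bandwidth counter. The sketch below assumes the values have already been scraped into plain numbers; the function name and sample figures are hypothetical, and it is an assumption that the two metrics count bytes at comparable layers (framing, compression and encryption overhead may differ), so the result is only an estimate.

```python
def estimate_non_gossip_bytes(libp2p_total: int, gossip_bytes_by_topic: dict[str, int]) -> int:
    """Estimate bytes moved outside gossip: overall bandwidth minus summed gossip bytes."""
    gossip_total = sum(gossip_bytes_by_topic.values())
    # Clamp at zero in case the counters were sampled at slightly different times.
    return max(libp2p_total - gossip_total, 0)


# Made-up sample values.
gossip_sent = {"data_column_sidecar_0": 600_000, "beacon_block": 150_000}
rpc_estimate = estimate_non_gossip_bytes(1_000_000, gossip_sent)
```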

There is a "Network" dashboard (not up to date with PeerDAS) that includes usage of these metrics https://github.com/sigp/lighthouse-metrics/blob/master/dashboards/Network.json

and a draft dashboard for PeerDAS https://github.com/sigp/lighthouse-metrics/pull/54

jimmygchen, Jul 23 '24 00:07

@KatyaRyazantseva is helping us with this as part of #6248 🙏

jimmygchen, Aug 21 '24 07:08