lighthouse
Add additional PeerDAS metrics
Description
Add additional PeerDAS metrics to give us more visibility into the node.
Suggested list from Andrew:
Just checked the Lighthouse das branch. Specific metrics:
- Full runtime of data column sidecars gossip verification (counter)
- Number of data column sidecars verified for gossip (counter)
- Total count of reconstructed columns (counter)
- Time taken to reconstruct columns (histogram)
- Time taken to compute data column sidecar, including cells, proofs and inclusion proof (histogram)
- Time taken to verify data_column sidecar inclusion proof (histogram)
- Runtime of batched data column kzg verification (histogram)
- Runtime of single data column kzg verification (histogram)
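The timing metrics above are all histograms. As a rough illustration of what recording one involves, here is a minimal std-only sketch with Prometheus-style cumulative buckets. The `Histogram` type, its fields, and the bucket bounds are hypothetical and are not Lighthouse's metrics API; a real implementation would presumably go through the existing metrics crate.

```rust
use std::time::Instant;

/// Hypothetical, std-only histogram with cumulative ("le") buckets.
pub struct Histogram {
    /// Bucket upper bounds in seconds, in ascending order.
    pub bounds: Vec<f64>,
    /// cumulative[i] = number of observations <= bounds[i].
    pub cumulative: Vec<u64>,
    /// Total number of observations, including those above the last bound.
    pub count: u64,
    /// Sum of all observed values in seconds.
    pub sum: f64,
}

impl Histogram {
    pub fn new(bounds: &[f64]) -> Self {
        Histogram {
            bounds: bounds.to_vec(),
            cumulative: vec![0; bounds.len()],
            count: 0,
            sum: 0.0,
        }
    }

    /// Record one observation, incrementing every bucket whose bound covers it.
    pub fn observe(&mut self, seconds: f64) {
        for (bound, slot) in self.bounds.iter().zip(self.cumulative.iter_mut()) {
            if seconds <= *bound {
                *slot += 1;
            }
        }
        self.count += 1;
        self.sum += seconds;
    }

    /// Run `work`, record how long it took, and return its result.
    pub fn time<T>(&mut self, work: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = work();
        self.observe(start.elapsed().as_secs_f64());
        out
    }
}
```

Wrapping a call such as column reconstruction in `time(|| ...)` keeps the measurement next to the code being measured, so early returns inside the closure are still timed.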
Counts on custody could be useful. `column_index` as labels?
- Total count of columns in custody (counter)
Gossipsub domain metrics will be very handy imo. Should contain topic `data_column_sidecar_{subnet_id}` as labels;
- Number of gossip messages sent to each topic (counter)
- Number of bytes sent to each topic (counter)
- Number of gossip messages received from each topic (including duplicates) (counter)
- Number of bytes received from each topic (including duplicates) (counter)
- Number of gossip messages received from each topic (deduplicated) (counter)
- Number of bytes received from each topic (deduplicated) (counter)
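To make the labelling scheme concrete, here is a minimal std-only sketch of per-topic message and byte counters keyed by the `data_column_sidecar_{subnet_id}` topic name. `TopicCounter` and `data_column_topic` are hypothetical names used only for illustration; they are not taken from Lighthouse or libp2p.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds the topic label suggested above.
pub fn data_column_topic(subnet_id: u64) -> String {
    format!("data_column_sidecar_{}", subnet_id)
}

/// Hypothetical pair of counters (messages and bytes), labelled by topic.
#[derive(Default)]
pub struct TopicCounter {
    pub messages: HashMap<String, u64>,
    pub bytes: HashMap<String, u64>,
}

impl TopicCounter {
    /// Count one message of `payload_len` bytes against `topic`.
    pub fn record(&mut self, topic: &str, payload_len: usize) {
        *self.messages.entry(topic.to_string()).or_insert(0) += 1;
        *self.bytes.entry(topic.to_string()).or_insert(0) += payload_len as u64;
    }
}
```

One instance per direction (sent, received including duplicates, received deduplicated) would cover the six counters listed above.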
Req/Resp domain metrics would be nice to have. Should contain protocol ID `/eth2/beacon_chain/req/data_column_sidecars_by_root/1/` and `/eth2/beacon_chain/req/data_column_sidecars_by_range/1/` as labels;
- Number of requests sent (counter)
- Number of requests received (counter)
- Number of responses sent (counter)
- Number of response bytes sent (counter)
- Number of responses received (counter)
- Number of response bytes received (counter)
I think we have most of those gossip metrics:
- `gossipsub_topic_msg_recv_bytes_total`: number of bytes received through gossip
- `gossipsub_topic_msg_sent_bytes_total`: number of bytes sent through gossip
- `gossipsub_topic_msg_recv_counts_unfiltered_total`: number of unfiltered gossip messages received
- `gossipsub_topic_msg_recv_counts_total`: number of gossip messages received
  - the number of duplicates can be calculated from the above two metrics
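The duplicate count mentioned above is the difference between the unfiltered and filtered receive counters for a topic. A minimal sketch, assuming both values are samples of `gossipsub_topic_msg_recv_counts_unfiltered_total` and `gossipsub_topic_msg_recv_counts_total` for the same topic (the function name is hypothetical):

```rust
/// Hypothetical helper: duplicate messages received on one topic, derived
/// from the unfiltered and deduplicated counter samples.
/// Uses a saturating subtraction so that samples taken at slightly different
/// moments (or around a counter reset) never underflow.
pub fn duplicate_messages(unfiltered_total: u64, deduplicated_total: u64) -> u64 {
    unfiltered_total.saturating_sub(deduplicated_total)
}
```

For example, 120 unfiltered messages against 100 deduplicated ones gives 20 duplicates.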
We're lacking metrics on bytes sent / received over rpc, I've raised an issue for this (https://github.com/sigp/lighthouse/issues/6114)
There's a metric for overall libp2p bandwidth, `libp2p_bandwidth_bytes_total`, which gives a rough idea of how many bytes are sent / received outside of gossip.
There is a "Network" dashboard (not up to date with PeerDAS) that includes usage of these metrics https://github.com/sigp/lighthouse-metrics/blob/master/dashboards/Network.json
and a draft dashboard for PeerDAS https://github.com/sigp/lighthouse-metrics/pull/54
@KatyaRyazantseva is helping us with this as part of #6248 🙏