
[Feature] Add additional `Validator Telemetry` metrics

Open raychu86 opened this issue 9 months ago • 3 comments

🚀 Feature

The Validator Telemetry module introduced by https://github.com/ProvableHQ/snarkOS/pull/3516 is a very simple tracker for validator participation. Many other metrics could be tracked to improve validator monitoring and help ensure network safety.

Some examples of additional metrics to track:

  • [x] Response time. This can be inferred from the logs if needed.
  • Sync rate
  • Percentage of proposals that are converted into certificates
  • Fullness of proposals
  • [x] Connectivity (how many other validators each validator is connected to)
  • Various stake weight considerations
  • [x] The latest seen IP address of each validator (useful for debugging purposes). This is already present in logs after: `Connected to 1 validators (of 3 bonded validators)`
  • etc.
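To make two of the unchecked items concrete (percentage of proposals converted into certificates, and fullness of proposals), here is a minimal sketch of how they could be derived from plain counters. The `ProposalStats` struct and all of its field names are hypothetical and not taken from the snarkOS codebase; the real telemetry module may track these differently.

```rust
/// Hypothetical per-validator counters. The names are illustrative only
/// and are not part of snarkOS.
#[derive(Debug, Default, Clone, Copy)]
pub struct ProposalStats {
    /// Number of batch proposals this validator has sent.
    pub proposals_sent: u64,
    /// Number of those proposals that were turned into certificates.
    pub proposals_certified: u64,
    /// Total number of transmissions included across all sent proposals.
    pub transmissions_included: u64,
    /// Assumed upper bound on transmissions per proposal.
    pub max_transmissions_per_proposal: u64,
}

impl ProposalStats {
    /// Percentage of proposals converted into certificates.
    /// Returns `None` when no proposals were sent, to avoid dividing by zero.
    pub fn certification_rate(&self) -> Option<f64> {
        if self.proposals_sent == 0 {
            return None;
        }
        Some(100.0 * self.proposals_certified as f64 / self.proposals_sent as f64)
    }

    /// Average fullness of proposals, as a percentage of the total capacity.
    /// Returns `None` when the capacity is zero or overflows `u64`.
    pub fn average_fullness(&self) -> Option<f64> {
        let capacity = self
            .proposals_sent
            .checked_mul(self.max_transmissions_per_proposal)?;
        if capacity == 0 {
            return None;
        }
        Some(100.0 * self.transmissions_included as f64 / capacity as f64)
    }
}
```

Returning `Option<f64>` rather than `0.0` keeps "no data yet" distinguishable from "0% certified", which matters if these values are later used for alerting.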

Additionally, the metrics are currently exposed only via logs and a REST endpoint. They should be exposed as Prometheus metrics as well.
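For reference, a rough sketch of what a per-validator gauge looks like in the Prometheus text exposition format. This is std-only Rust written for illustration; it does not use the metrics crates snarkOS actually depends on, and the function name, metric name, and `validator` label are all assumptions.

```rust
use std::fmt::Write;

/// Escapes a label value per the Prometheus text format:
/// backslash, double quote, and newline must be escaped.
fn escape_label(value: &str) -> String {
    value
        .replace('\\', "\\\\")
        .replace('"', "\\\"")
        .replace('\n', "\\n")
}

/// Renders one gauge family with a hypothetical `validator` label.
/// `help` is assumed to be a single line without backslashes.
pub fn render_gauge(name: &str, help: &str, samples: &[(&str, f64)]) -> String {
    let mut out = String::new();
    // Writing into a `String` cannot fail, so the results are ignored.
    let _ = writeln!(out, "# HELP {name} {help}");
    let _ = writeln!(out, "# TYPE {name} gauge");
    for (validator, value) in samples {
        let _ = writeln!(
            out,
            "{name}{{validator=\"{}\"}} {value}",
            escape_label(validator)
        );
    }
    out
}
```

Calling `render_gauge` with a made-up metric name and one `(address, percentage)` sample yields a `# HELP` line, a `# TYPE ... gauge` line, and one sample line per validator, which is the shape a Prometheus scrape expects.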

raychu86 Mar 31 '25 14:03

Needs further tweaking on snarkVM metrics side.

Relevant PR - https://github.com/ProvableHQ/snarkOS/pull/3064

raychu86 Jul 30 '25 20:07

Other related topics to report one way or another:

  • validator height: tackled by https://github.com/ProvableHQ/snarkOS/pull/3968
  • version: https://github.com/ProvableHQ/snarkOS/pull/3971#pullrequestreview-3377444240
  • machine specs: besides the complexity, it is unclear how necessary this is at this stage, since inadequate specs should be directly reflected in a validator's performance. Note the adage "alert on symptoms, not on causes"

vicsn Sep 30 '25 15:09

Currently excited about:

  • break up telemetry measurement
  • Percentage of proposals that are converted into certificates
  • Percentage of certificates which make it to subdag
  • Duplication of transmissions
  • Fullness of proposals
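As a sketch of the "duplication of transmissions" item: given the transmission IDs observed across a set of proposals, the duplicated share can be computed with a set. The function name and the idea of feeding it a flat slice of IDs are assumptions for illustration, not how snarkOS stores transmissions.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Fraction (0.0 to 1.0) of observed transmissions that repeat an ID
/// already seen elsewhere in the slice.
/// Returns `None` for an empty input, to avoid dividing by zero.
pub fn duplicate_fraction<T: Hash + Eq>(ids: &[T]) -> Option<f64> {
    if ids.is_empty() {
        return None;
    }
    // Each distinct ID is counted once; every extra occurrence is a duplicate.
    let unique: HashSet<&T> = ids.iter().collect();
    let duplicates = ids.len() - unique.len();
    Some(duplicates as f64 / ids.len() as f64)
}
```

The same ratio pattern (numerator counter over denominator counter, `None` when the denominator is zero) would also cover the percentage of certificates that make it into a subdag.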

vicsn Oct 27 '25 17:10