keep-core
ENG-506 Add performance metrics tracking for key operations
- Introduced a new system to monitor various operations within the Keep Core node, including wallet actions, DKG processes, signing operations, coordination procedures, and network activities.
- Metrics are recorded through a new interface, allowing for optional integration without impacting performance when disabled.
- Updated relevant components to wire in metrics recording, ensuring comprehensive coverage of critical operations.
- Added documentation detailing implemented metrics and their usage.
This enhancement provides better visibility into node performance and health, facilitating monitoring and troubleshooting.
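The optional-integration point can be illustrated with a minimal sketch. Every name here (`MetricsRecorder`, `noopRecorder`, `memoryRecorder`, `signer`, `setMetricsRecorder`) is hypothetical and not taken from the keep-core source; only the two metric names come from the lists in this description. The idea is that a component always holds a recorder, and the default one does nothing:

```go
package main

import (
	"fmt"
	"time"
)

// MetricsRecorder is a hypothetical interface for this sketch; the real
// interface in keep-core may use different names and signatures.
type MetricsRecorder interface {
	IncrementCounter(name string, delta float64)
	SetGauge(name string, value float64)
	RecordDuration(name string, d time.Duration)
}

// noopRecorder is the default recorder: every call returns immediately.
type noopRecorder struct{}

func (noopRecorder) IncrementCounter(string, float64)     {}
func (noopRecorder) SetGauge(string, float64)             {}
func (noopRecorder) RecordDuration(string, time.Duration) {}

// memoryRecorder is a toy backend that stores values in maps.
// It is not safe for concurrent use.
type memoryRecorder struct {
	counters  map[string]float64
	gauges    map[string]float64
	durations map[string][]time.Duration
}

func newMemoryRecorder() *memoryRecorder {
	return &memoryRecorder{
		counters:  map[string]float64{},
		gauges:    map[string]float64{},
		durations: map[string][]time.Duration{},
	}
}

func (m *memoryRecorder) IncrementCounter(name string, delta float64) {
	m.counters[name] += delta
}

func (m *memoryRecorder) SetGauge(name string, value float64) {
	m.gauges[name] = value
}

func (m *memoryRecorder) RecordDuration(name string, d time.Duration) {
	m.durations[name] = append(m.durations[name], d)
}

// signer stands in for any instrumented component.
type signer struct {
	metrics MetricsRecorder
}

// newSigner starts with the no-op recorder, so metrics are off by default.
func newSigner() *signer { return &signer{metrics: noopRecorder{}} }

// setMetricsRecorder opts the component in to metrics. A nil recorder is
// ignored, so the current recorder stays in place.
func (s *signer) setMetricsRecorder(r MetricsRecorder) {
	if r != nil {
		s.metrics = r
	}
}

func (s *signer) sign() {
	start := time.Now()
	s.metrics.IncrementCounter("performance_signing_operations_total", 1)
	// ... the actual signing work would happen here ...
	s.metrics.RecordDuration("performance_signing_duration_seconds", time.Since(start))
}

func main() {
	s := newSigner()
	s.sign() // metrics disabled: nothing is recorded

	rec := newMemoryRecorder()
	s.setMetricsRecorder(rec)
	s.sign()
	fmt.Println(rec.counters["performance_signing_operations_total"])
}
```

With this shape the call sites never need an "is metrics enabled" branch, which is one plausible way to keep the disabled path cheap.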
Wallet Dispatcher Metrics (6 metrics)
Location: pkg/tbtc/wallet.go
- ✅ performance_wallet_dispatcher_active_actions (gauge)
- ✅ performance_wallet_dispatcher_rejected_total (counter)
- ✅ performance_wallet_actions_total (counter)
- ✅ performance_wallet_action_success_total (counter)
- ✅ performance_wallet_action_failed_total (counter)
- ✅ performance_wallet_action_duration_seconds (histogram)
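How the six wallet dispatcher metrics could relate to one another is sketched below. This is a simplified, synchronous model under stated assumptions: the `dispatcher` type, the `metrics` store, the one-action-per-wallet rule, and both error values are illustrative and do not reproduce the code in pkg/tbtc/wallet.go.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Hypothetical error values for this sketch.
var (
	errWalletBusy   = errors.New("wallet is busy")
	errActionFailed = errors.New("action failed")
)

// metrics is a toy in-memory store, not safe for concurrent use.
type metrics struct {
	counters  map[string]float64
	gauges    map[string]float64
	durations map[string][]float64 // observed durations in seconds
}

// dispatcher allows at most one in-flight action per wallet (an assumption).
type dispatcher struct {
	m    *metrics
	busy map[string]bool
}

func newDispatcher() *dispatcher {
	return &dispatcher{
		m: &metrics{
			counters:  map[string]float64{},
			gauges:    map[string]float64{},
			durations: map[string][]float64{},
		},
		busy: map[string]bool{},
	}
}

// dispatch runs the action and updates all six metrics along the way.
func (d *dispatcher) dispatch(wallet string, action func() error) error {
	if d.busy[wallet] {
		d.m.counters["performance_wallet_dispatcher_rejected_total"]++
		return errWalletBusy
	}
	d.busy[wallet] = true
	d.m.counters["performance_wallet_actions_total"]++
	d.m.gauges["performance_wallet_dispatcher_active_actions"]++
	start := time.Now()

	err := action()

	d.m.durations["performance_wallet_action_duration_seconds"] = append(
		d.m.durations["performance_wallet_action_duration_seconds"],
		time.Since(start).Seconds(),
	)
	d.m.gauges["performance_wallet_dispatcher_active_actions"]--
	delete(d.busy, wallet)

	if err != nil {
		d.m.counters["performance_wallet_action_failed_total"]++
		return err
	}
	d.m.counters["performance_wallet_action_success_total"]++
	return nil
}

func main() {
	d := newDispatcher()
	_ = d.dispatch("wallet-1", func() error {
		// A second action for the same wallet is rejected while this one runs.
		_ = d.dispatch("wallet-1", func() error { return nil })
		return nil
	})
	_ = d.dispatch("wallet-1", func() error { return errActionFailed })
	fmt.Println(d.m.counters)
}
```

In this model a rejected action is not counted in performance_wallet_actions_total, and every accepted action ends up in exactly one of the success or failed counters. Whether the real implementation counts rejections the same way is not stated in this description.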
DKG Operations Metrics (6 metrics)
Location: pkg/tbtc/dkg.go
- ✅ performance_dkg_joined_total (counter)
- ✅ performance_dkg_failed_total (counter)
- ✅ performance_dkg_duration_seconds (histogram)
- ✅ performance_dkg_validation_total (counter)
- ✅ performance_dkg_challenges_submitted_total (counter)
- ✅ performance_dkg_approvals_submitted_total (counter)
Signing Operations Metrics (5 metrics)
Location: pkg/tbtc/signing.go, pkg/tbtc/node.go
- ✅ performance_signing_operations_total (counter)
- ✅ performance_signing_success_total (counter)
- ✅ performance_signing_failed_total (counter)
- ✅ performance_signing_duration_seconds (histogram)
- ✅ performance_signing_timeouts_total (counter)
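One way the timeout counter could be kept apart from the generic failure counter is to classify the returned error. The sketch below assumes that a timeout surfaces as `context.DeadlineExceeded` and that a timed-out attempt is not also counted as failed. Neither assumption is confirmed by this description, and `signingCounters`, `signWithTimeout`, and `errInvalidShare` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Illustrative durations, chosen only to keep the demo fast.
const (
	signingTimeout = 20 * time.Millisecond
	fastSigning    = time.Millisecond
	slowSigning    = 2 * time.Second
)

// errInvalidShare is a hypothetical non-timeout failure.
var errInvalidShare = errors.New("invalid signature share")

// signingCounters mirrors the four signing counters listed above.
type signingCounters struct {
	operations, success, failed, timeouts int
}

// observe classifies one finished signing attempt.
// Assumption: a timeout is counted separately and not also as a failure.
func (c *signingCounters) observe(err error) {
	c.operations++
	switch {
	case err == nil:
		c.success++
	case errors.Is(err, context.DeadlineExceeded):
		c.timeouts++
	default:
		c.failed++
	}
}

// signWithTimeout simulates signing that needs `work` to finish and is
// aborted once `timeout` has elapsed.
func signWithTimeout(timeout, work time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	select {
	case <-time.After(work):
		return nil
	case <-ctx.Done():
		// Wrapping with %w keeps the cause visible to errors.Is.
		return fmt.Errorf("signing aborted: %w", ctx.Err())
	}
}

func main() {
	c := &signingCounters{}
	c.observe(signWithTimeout(signingTimeout, fastSigning))
	c.observe(signWithTimeout(signingTimeout, slowSigning))
	c.observe(errInvalidShare)
	fmt.Printf("%+v\n", *c)
}
```

Using `errors.Is` rather than a direct comparison means the classification still works when intermediate layers wrap the error.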
Coordination Operations Metrics (4 metrics)
Location: pkg/tbtc/coordination.go, pkg/tbtc/node.go
- ✅ performance_coordination_windows_detected_total (counter)
- ✅ performance_coordination_procedures_executed_total (counter)
- ✅ performance_coordination_failed_total (counter)
- ✅ performance_coordination_duration_seconds (histogram)
Network Operations Metrics (10 metrics)
Location: pkg/net/libp2p/libp2p.go, pkg/net/libp2p/channel.go, pkg/net/libp2p/channel_manager.go
- ✅ performance_peer_connections_total (counter)
- ✅ performance_peer_disconnections_total (counter)
- ✅ performance_message_broadcast_total (counter)
- ✅ performance_message_received_total (counter)
- ✅ performance_incoming_message_queue_size (gauge, with channel label)
- ✅ performance_message_handler_queue_size (gauge, with channel and handler labels)
- ✅ performance_ping_test_total (counter)
- ✅ performance_ping_test_success_total (counter)
- ✅ performance_ping_test_failed_total (counter)
- ✅ performance_ping_test_duration_seconds (histogram)
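The two queue-size gauges differ from the other metrics in that they carry labels, so each channel (and each handler) gets its own series. A minimal sketch of label-keyed gauges follows. The label keys `channel` and `handler`, the sample values, and the `labeledGauges` type are assumptions for illustration, and the rendering only approximates a Prometheus-style text line.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// seriesKey renders name{k1="v1",k2="v2"} with label names sorted, so the
// same label set always maps to the same series regardless of map order.
func seriesKey(name string, labels map[string]string) string {
	if len(labels) == 0 {
		return name
	}
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		// %q is close enough to exposition-format quoting for plain ASCII values.
		parts = append(parts, fmt.Sprintf("%s=%q", k, labels[k]))
	}
	return name + "{" + strings.Join(parts, ",") + "}"
}

// labeledGauges stores one value per (metric name, label set) pair.
type labeledGauges struct {
	values map[string]float64
}

func newLabeledGauges() *labeledGauges {
	return &labeledGauges{values: map[string]float64{}}
}

// set overwrites the current value of one series.
func (g *labeledGauges) set(name string, labels map[string]string, v float64) {
	g.values[seriesKey(name, labels)] = v
}

// lines returns one "series value" line per series, sorted by series key.
func (g *labeledGauges) lines() []string {
	keys := make([]string, 0, len(g.values))
	for k := range g.values {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		out = append(out, fmt.Sprintf("%s %g", k, g.values[k]))
	}
	return out
}

func main() {
	g := newLabeledGauges()
	g.set("performance_incoming_message_queue_size",
		map[string]string{"channel": "tbtc"}, 12)
	g.set("performance_message_handler_queue_size",
		map[string]string{"handler": "h1", "channel": "tbtc"}, 3)
	for _, line := range g.lines() {
		fmt.Println(line)
	}
}
```

Because a gauge is set rather than incremented, each update simply overwrites the previous queue size for that channel or handler.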