Track Lighthouse scoring of Lodestar

Open dapplion opened this issue 4 years ago • 2 comments

Lighthouse exposes the score of every peer, including its gossipsub score, through its HTTP API at /lighthouse/peers. This endpoint can be polled at a fixed interval and the scores sent to Prometheus as metrics.

  • [ ] Run Lodestar and Lighthouse in a local devnet and track score of Lodestar's peer_id over time
  • [ ] Run Lodestar and Lighthouse in Prater, force a direct connection and track Lodestar's peer_id over time
  • [ ] Set up a run in CI (i.e. with Kurtosis) and in a local devnet that goes to finalization, tracks Lodestar's score, and asserts it stays above a threshold
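
A minimal sketch of how the polling and the threshold assertion from the checklist could look, in Python. It assumes Lighthouse's HTTP API is reachable at http://localhost:5052 and that the response is a JSON array of objects with `peer_id` and `peer_info` keys. The helper names (`fetch_peers`, `score_of`, `track`, `assert_above`) and the `extract_score` callback are hypothetical, since the exact JSON layout of the score is not fixed here.

```python
import json
import time
import urllib.request

# Assumption: address of the Lighthouse HTTP API on the local machine.
LIGHTHOUSE_URL = "http://localhost:5052"


def fetch_peers(base_url=LIGHTHOUSE_URL):
    """GET /lighthouse/peers and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/lighthouse/peers", timeout=10) as resp:
        return json.load(resp)


def score_of(peers, peer_id, extract_score):
    """Return the score of `peer_id`, or None if that peer is not listed."""
    for peer in peers:
        if peer.get("peer_id") == peer_id:
            return extract_score(peer["peer_info"])
    return None


def track(peer_id, extract_score, interval_s, rounds,
          fetch=fetch_peers, sleep=time.sleep):
    """Poll `rounds` times and return a list of (unix_time, score) samples."""
    samples = []
    for i in range(rounds):
        samples.append((time.time(), score_of(fetch(), peer_id, extract_score)))
        if i + 1 < rounds:
            sleep(interval_s)
    return samples


def assert_above(samples, threshold):
    """Raise AssertionError if any recorded score fell below `threshold`."""
    low = [(t, s) for t, s in samples if s is not None and s < threshold]
    if low:
        raise AssertionError(f"{len(low)} sample(s) below {threshold}: {low}")
```

In CI, `track` would be called with Lodestar's peer_id for the duration of the devnet run, followed by `assert_above` with whatever threshold is agreed on. `fetch` and `sleep` are injectable so the logic can be exercised without a running node.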

Endpoint specs

Endpoint /lighthouse/peers returns

.map(|(peer_id, peer_info)| eth2::lighthouse::Peer {
    peer_id: peer_id.to_string(),
    peer_info: peer_info.clone(),
})

https://github.com/sigp/lighthouse/blob/79db2d4deb6a47947699d8a4a39347c19ee6e5d6/beacon_node/http_api/src/lib.rs#L2295

pub struct PeerInfo<T: EthSpec> {
    /// The peers reputation
    score: Score,

https://github.com/sigp/lighthouse/blob/79db2d4deb6a47947699d8a4a39347c19ee6e5d6/beacon_node/lighthouse_network/src/peer_manager/peerdb/peer_info.rs#L23

pub struct RealScore {
    /// The global score.
    // NOTE: In the future we may separate this into sub-scores involving the RPC, Gossipsub and
    // lighthouse.
    lighthouse_score: f64,
    gossipsub_score: f64,
    /// We ignore the negative gossipsub scores of some peers to allow decaying without
    /// disconnecting.
    ignore_negative_gossipsub_score: bool,
    score: f64,
    /// The time the score was last updated to perform time-based adjustments such as score-decay.
    #[serde(skip)]
    last_updated: Instant,
}

https://github.com/sigp/lighthouse/blob/79db2d4deb6a47947699d8a4a39347c19ee6e5d6/beacon_node/lighthouse_network/src/peer_manager/peerdb/score.rs#L133
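
For the Prometheus side, the serialized fields of RealScore could be mapped to gauges roughly as follows. This is a sketch under assumptions: the `{"Real": {...}}` wrapper around the fields and the metric names are guesses, not taken from the Lighthouse source quoted above. `last_updated` is marked `#[serde(skip)]`, so it is not expected in the JSON.

```python
from dataclasses import dataclass


@dataclass
class RealScore:
    # Mirrors the serialized fields of the Rust struct quoted above.
    lighthouse_score: float
    gossipsub_score: float
    ignore_negative_gossipsub_score: bool
    score: float


def parse_score(raw):
    """Build a RealScore from decoded JSON, or return None if it does not fit.

    Assumption: the object may be wrapped as {"Real": {...}}; a bare object
    with the four fields is accepted as well.
    """
    if isinstance(raw, dict) and "Real" in raw:
        raw = raw["Real"]
    if not isinstance(raw, dict):
        return None
    try:
        return RealScore(
            lighthouse_score=float(raw["lighthouse_score"]),
            gossipsub_score=float(raw["gossipsub_score"]),
            ignore_negative_gossipsub_score=bool(raw["ignore_negative_gossipsub_score"]),
            score=float(raw["score"]),
        )
    except KeyError:
        return None


def to_prometheus(peer_id, s):
    """Render one peer's score as Prometheus text exposition lines.

    The metric names are hypothetical.
    """
    labels = f'peer_id="{peer_id}"'
    lines = [
        f"lodestar_score_total{{{labels}}} {s.score}",
        f"lodestar_score_lighthouse{{{labels}}} {s.lighthouse_score}",
        f"lodestar_score_gossipsub{{{labels}}} {s.gossipsub_score}",
    ]
    return "\n".join(lines) + "\n"
```

Keeping the three values as separate gauges makes it possible to see whether a drop in the total score comes from the gossipsub component or from Lighthouse's own scoring.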

dapplion, Mar 14 '22 07:03

Prysm also has a debug RPC endpoint that exposes scoring data for all of its peers: https://github.com/prysmaticlabs/prysm/blob/9abea200a5aae001b3d7ef77891f24f2693543a7/proto/prysm/v1alpha1/debug.proto#L51

The debug scoring output has a lot of data, so it should give us a good idea of why Lodestar gets downscored. Prysm debug logs would also be helpful here:

--verbosity=debug --enable-debug-rpc-endpoints

Run Prysm with these two flags.

Also, if you query through the Prysm gateway, the responses will be in JSON. Pipe them through jq to get prettified output:

curl -X GET "http://localhost:3500/eth/v1alpha1/debug/peers" -H "accept: application/json" | jq '.'

For protobufs, this is the schema:

https://github.com/prysmaticlabs/prysm/blob/9abea200a5aae001b3d7ef77891f24f2693543a7/proto/prysm/v1alpha1/debug.proto#L132

philknows, Mar 15 '22 02:03

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 15 days if no further activity occurs. Thank you for your contributions.

stale[bot], Sep 21 '22 03:09