Check and Improve Beacon Node Health Status Logic
🎯 Problem to be solved
The current implementation of the Beacon Node Health Status logic in Charon does not consider the number of connected peers. As a result, a Beacon Node with zero peers is still reported as having a "Health Status" of OK, which may not reflect the operational status of the node.
Resources
This cluster reports a Health Status of OK even though its Beacon Node (BN) has zero peers (a condition that is supposed to be reflected in the health status).
Per the docs for the /readyz endpoint (which is used for the Health Status gauge):
"Set to 1 if the node is operational and monitoring api /readyz endpoint is returning 200s. Else /readyz is returning 500s and this metric is either set to 2 if the beacon node is down, or 3 if the beacon node is syncing, or 4 if quorum peers are not connected."
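To make the gap concrete, here is a minimal Go sketch of the mapping the quoted docs describe. It is not Charon's actual code: the type, constant, and function names (`readyStatus`, `nodeState`, `evaluate`) are hypothetical, and the order in which the conditions are checked is an assumption. Only the numeric values 1 to 4 and the 200/500 responses come from the docs.

```go
package main

// readyStatus holds the gauge values quoted from the docs.
// The type and constant names are hypothetical, for illustration only.
type readyStatus int

const (
	statusReady                   readyStatus = 1 // /readyz returns 200
	statusBeaconNodeDown          readyStatus = 2 // /readyz returns 500
	statusBeaconNodeSyncing       readyStatus = 3 // /readyz returns 500
	statusQuorumPeersNotConnected readyStatus = 4 // /readyz returns 500
)

// nodeState is a hypothetical snapshot of the inputs to the readiness check.
// Note that the beacon node's own peer count is not among them.
type nodeState struct {
	beaconNodeUp         bool
	beaconNodeSyncing    bool
	quorumPeersConnected bool
}

// evaluate returns the gauge value and the HTTP status code for /readyz.
// The order of the checks is an assumption, not taken from the docs.
func evaluate(s nodeState) (readyStatus, int) {
	switch {
	case !s.beaconNodeUp:
		return statusBeaconNodeDown, 500
	case s.beaconNodeSyncing:
		return statusBeaconNodeSyncing, 500
	case !s.quorumPeersConnected:
		return statusQuorumPeersNotConnected, 500
	default:
		return statusReady, 200
	}
}
```

With this mapping, a beacon node that is up and synced but has zero peers still evaluates to 1, which matches the behaviour reported above.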
🛠️ Proposed solution
- [ ] Implement a check for the minimum required number (at least 3) of connected peers
- [ ] Update the Health Status logic to consider a Beacon Node with zero peers as unhealthy
- [ ] Determine the appropriate Health Status value or error code for a Beacon Node with zero peers
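
The first two items could look roughly like the Go sketch below. It assumes the peer count is read from the standard Beacon API `GET /eth/v1/node/peer_count` endpoint, which returns counts as decimal strings; whether Charon would use that endpoint is an assumption. The names `minBeaconNodePeers`, `connectedPeers`, and `hasEnoughPeers` are hypothetical. The status value to report for a failing check is deliberately left out, since the third item is still undecided.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// minBeaconNodePeers is the threshold suggested in the checklist (at least 3).
const minBeaconNodePeers = 3

// peerCountResponse models the part of the peer_count response body used here.
// The Beacon API encodes the counts as decimal strings.
type peerCountResponse struct {
	Data struct {
		Connected string `json:"connected"`
	} `json:"data"`
}

// connectedPeers extracts the connected peer count from a peer_count response body.
func connectedPeers(body []byte) (int, error) {
	var resp peerCountResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, fmt.Errorf("decode peer count: %w", err)
	}
	n, err := strconv.Atoi(resp.Data.Connected)
	if err != nil {
		return 0, fmt.Errorf("parse connected peers %q: %w", resp.Data.Connected, err)
	}
	return n, nil
}

// hasEnoughPeers reports whether the beacon node meets the minimum peer threshold.
func hasEnoughPeers(connected int) bool {
	return connected >= minBeaconNodePeers
}
```

A readiness check could call `connectedPeers` on the response body and treat the node as unhealthy when `hasEnoughPeers` returns false. Treating a failed request or an unparsable count as unhealthy as well is a design choice that would need to be settled alongside the status value.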