feat(salud): mark self as unhealthy if unreachable

Open istae opened this issue 2 years ago • 4 comments

Checklist

  • [ ] I have read the coding guide.
  • [ ] My change requires a documentation update, and I have done it.
  • [ ] I have added tests to cover my changes.
  • [ ] I have filled out the description and linked the related issues.

Description

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):

istae Aug 30 '23 12:08

These peers have a radius != 9 and zero neighbors. Notice their reachability: bad-peers.txt
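The filter described above (radius != 9 and zero neighbors) can be sketched in Go roughly as follows. This is only an illustration: the `PeerStatus` type, its field names, and the sample values are hypothetical and are not taken from bee's code or from bad-peers.txt.

```go
package main

import "fmt"

// PeerStatus is a hypothetical record for one peer; the field names are
// assumptions made for this sketch, not bee's actual types.
type PeerStatus struct {
	Overlay          string
	StorageRadius    uint8
	NeighborhoodSize int
	IsReachable      bool
}

// expectedRadius is the radius value the comment compares against.
const expectedRadius = 9

// suspectPeers returns the peers whose radius differs from expectedRadius
// and that report zero neighbors.
func suspectPeers(peers []PeerStatus) []PeerStatus {
	var out []PeerStatus
	for _, p := range peers {
		if p.StorageRadius != expectedRadius && p.NeighborhoodSize == 0 {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Made-up sample values, for illustration only.
	peers := []PeerStatus{
		{Overlay: "peer-a", StorageRadius: 9, NeighborhoodSize: 12, IsReachable: true},
		{Overlay: "peer-b", StorageRadius: 4, NeighborhoodSize: 0, IsReachable: false},
		{Overlay: "peer-c", StorageRadius: 4, NeighborhoodSize: 3, IsReachable: true},
	}
	for _, p := range suspectPeers(peers) {
		fmt.Printf("%s radius=%d neighbors=%d reachable=%t\n",
			p.Overlay, p.StorageRadius, p.NeighborhoodSize, p.IsReachable)
	}
}
```

Printing the reachability flag next to each match makes it easy to check whether the flagged peers are also the unreachable ones.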

istae Aug 30 '23 15:08

these peers have a radius != 9 and zero neighbors, notice their reachability bad-peers.txt

Are the low storage radius and reserve caused by these nodes being unreachable? Or is their low connectedPeers count a direct result of being both unreachable AND the reachable peers not allowing enough incoming connections for such nodes? Just askin'

Or worse, do they have low connectedPeers because the open files limit (ulimit -n) is set too low for the process or service? So many possible explanations for the observed conditions.

ldeffenb Aug 30 '23 15:08

It's most likely improper setup, such as firewall and NAT issues. The other likely cause is hardware limitations. A bee node has to keep somewhere around 150 TCP connections open, and the nodes in the file I posted above probably belong to large node operators, so hardware such as routers not being able to handle the load may be causing the issues.
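To rule out the open files limit raised earlier in the thread, an operator could compare the process limit against the roughly 150 connections mentioned here. The following is a minimal Go sketch for Unix-like systems. The `headroomFactor` margin and the helper name are arbitrary assumptions for illustration, and a node also needs descriptors for things other than sockets, so treat the result as a hint rather than a verdict.

```go
package main

import (
	"fmt"
	"syscall"
)

// assumedConnections is the rough figure quoted in the thread.
const assumedConnections = 150

// headroomFactor is an arbitrary illustrative margin, not a bee requirement.
const headroomFactor = 4

// limitLooksSufficient reports whether the soft open-files limit leaves
// headroomFactor times the expected number of connections.
func limitLooksSufficient(softLimit, connections uint64) bool {
	return softLimit >= connections*headroomFactor
}

func main() {
	var rl syscall.Rlimit
	// RLIMIT_NOFILE is the limit that `ulimit -n` reports for the shell.
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		fmt.Println("could not read RLIMIT_NOFILE:", err)
		return
	}
	fmt.Printf("open files: soft=%d hard=%d\n", rl.Cur, rl.Max)
	if !limitLooksSufficient(uint64(rl.Cur), assumedConnections) {
		fmt.Println("soft limit may be too low for the expected connection count")
	}
}
```

The check only reflects the limit of the process that runs it, so it would need to run under the same user and service configuration as the node.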

istae Aug 30 '23 22:08

We will wait until ph4's contract is released to see how the situation changes, and then proceed accordingly.

nikipapadatou Sep 18 '23 11:09