
Handle bad blocks gracefully

Open arnaldopereira opened this issue 3 years ago • 1 comments

Problem: when an unexpected block is received, ette exits with a log.Fatalf() call. That happens in two cases:

  1. https://github.com/itzmeanjan/ette/blob/main/app/block/listener.go#L52
  2. https://github.com/itzmeanjan/ette/blob/main/app/block/listener.go#L66

The latter states that ette bails out because it basically stops relying on the node, and admits that this is not a state-of-the-art solution.

Suggestion: change this behavior so that ette notifies the user of bad blocks and ignores them, instead of letting ette itself decide to stop relying on the node. In production or pseudo-production environments, ette would be running under supervision, so a supervisor would bring it back up in those situations. This means that firing a log.Fatalf() doesn't even have the intended effect; it only has a bad side effect: ette goes offline for a few seconds and then comes back online, feeding from the same node again.
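
A minimal sketch of the suggested behavior, not ette's actual code: `Header`, `ErrBadBlock`, `consume`, and the empty-hash check are all hypothetical stand-ins, since the real checks live in listener.go. The point is only that a bad block gets logged and skipped while the loop keeps running, where a log.Fatalf() would end the process.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Header is a hypothetical, trimmed-down stand-in for a received block header.
type Header struct {
	Number uint64
	Hash   string
}

// ErrBadBlock is a hypothetical sentinel error for unexpected blocks.
var ErrBadBlock = errors.New("unexpected block")

// consume reads headers until the channel is closed. A header that fails
// validation is logged and skipped instead of terminating the process.
// It returns how many headers were skipped.
func consume(headers <-chan Header, validate func(Header) error, handle func(Header)) int {
	skipped := 0
	for h := range headers {
		if err := validate(h); err != nil {
			// Notify the operator, then ignore the block and keep listening.
			log.Printf("[!] skipping bad block %d: %v", h.Number, err)
			skipped++
			continue
		}
		handle(h)
	}
	return skipped
}

func main() {
	headers := make(chan Header, 3)
	headers <- Header{Number: 1, Hash: "0xaa"}
	headers <- Header{Number: 2, Hash: ""} // treated as "bad" by the check below
	headers <- Header{Number: 3, Hash: "0xcc"}
	close(headers)

	// Placeholder check; the real conditions are the ones in listener.go.
	validate := func(h Header) error {
		if h.Hash == "" {
			return fmt.Errorf("%w: empty hash", ErrBadBlock)
		}
		return nil
	}

	skipped := consume(headers, validate, func(h Header) {
		fmt.Println("processed block", h.Number)
	})
	fmt.Println("skipped:", skipped)
}
```

The skipped count (or the log line) is what a user-facing notification could be built on; how to deliver it is left open here.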

arnaldopereira avatar Jun 16 '21 13:06 arnaldopereira

Hm, I see, but how do you envision the supervisor being notified?

What I can think of is bringing in a message queue, to which ette periodically publishes its health status for the supervisor to check. Or maybe exposing a health-status endpoint, which the supervisor can poll periodically.
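
A rough sketch of the second option (a polled health endpoint), under stated assumptions: the `Health` type, its methods, the `/health` path, and the JSON field name are hypothetical and not part of ette. The handler reports 200 while a good block has been seen recently and 503 otherwise, so a supervisor can decide for itself when to restart or switch nodes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// Health remembers when the last good block was seen (hypothetical type).
type Health struct {
	mu       sync.Mutex
	lastGood time.Time
	maxAge   time.Duration
}

// NewHealth builds a tracker that is healthy only if a good block
// arrived within the last maxAgeSeconds seconds.
func NewHealth(maxAgeSeconds int) *Health {
	return &Health{maxAge: time.Duration(maxAgeSeconds) * time.Second}
}

// MarkGood records that a good block was just processed.
func (h *Health) MarkGood() {
	h.mu.Lock()
	h.lastGood = time.Now()
	h.mu.Unlock()
}

// Healthy reports whether a good block was seen recently enough.
// Before the first MarkGood call it returns false.
func (h *Health) Healthy() bool {
	h.mu.Lock()
	defer h.mu.Unlock()
	return !h.lastGood.IsZero() && time.Since(h.lastGood) <= h.maxAge
}

// ServeHTTP answers 200 when healthy and 503 otherwise, with a small JSON body.
func (h *Health) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	ok := h.Healthy()
	status := http.StatusOK
	if !ok {
		status = http.StatusServiceUnavailable
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(status)
	_ = json.NewEncoder(w).Encode(map[string]bool{"healthy": ok})
}

func main() {
	h := NewHealth(120)

	// Exercise the handler in-process; no network listener is needed here.
	poll := func() int {
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/health", nil))
		return rec.Code
	}

	fmt.Println("before any block:", poll())
	h.MarkGood()
	fmt.Println("after a good block:", poll())
}
```

In a real service the same value could be mounted with `http.Handle("/health", h)` and served via `http.ListenAndServe`, with `MarkGood` called from the block listener after each block that passes validation.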

itzmeanjan avatar Jun 23 '21 03:06 itzmeanjan

I've stopped maintaining ette, so closing this issue. Thanks.

itzmeanjan avatar Jan 13 '23 12:01 itzmeanjan