
Macro block production can stop when validators fail to send or receive messages.

Open nibhar opened this issue 3 years ago • 1 comments

Handel needs incoming network activity to generate outgoing network activity. An initial LevelUpdate is sent, which is supposed to trigger the remaining peers to create new aggregates and thus new messages.

In rare circumstances, however, these messages can fail to be sent or received. If that happens for all peers, the aggregation can stop, because no new network messages are generated when none are received.

This effect is easiest to observe with just 2 validators, but in theory it extends beyond that, with a lower probability. It is also not the desired design in general and thus needs to be changed.
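To make the failure mode concrete, here is a toy model of the 2-validator case. It is only a sketch under stated assumptions: `Policy`, `simulate`, and the bitmask representation are hypothetical names invented for illustration and are not taken from the Handel code in this repository. The `PeriodicResend` variant is one possible mitigation, shown for contrast, not a decided fix.

```rust
/// Hypothetical sending policies for the toy model (not the real Handel API).
#[derive(Clone, Copy, PartialEq)]
pub enum Policy {
    /// After the initial update, send only when a received message improved
    /// the local aggregate.
    ReactiveOnly,
    /// Assumption for illustration: also resend the current aggregate on
    /// every tick, regardless of what was received.
    PeriodicResend,
}

/// Simulates two peers for `ticks` rounds. Every message sent during the
/// first `drop_until` ticks is lost. Returns each peer's final aggregate
/// as a bitmask of the contributions it has seen.
pub fn simulate(policy: Policy, drop_until: usize, ticks: usize) -> [u8; 2] {
    // Each peer starts with only its own contribution.
    let mut agg: [u8; 2] = [0b01, 0b10];
    // Both peers want to send their initial update at tick 0.
    let mut wants_send = [true, true];

    for tick in 0..ticks {
        // Delivery phase: collect what each peer receives this tick.
        let mut inbox: [Option<u8>; 2] = [None, None];
        for peer in 0..2 {
            let sends = wants_send[peer] || policy == Policy::PeriodicResend;
            // The message is delivered only once the lossy period is over.
            if sends && tick >= drop_until {
                inbox[1 - peer] = Some(agg[peer]);
            }
        }

        // Processing phase: a peer schedules a new message only if the
        // received update improved its aggregate.
        wants_send = [false, false];
        for peer in 0..2 {
            if let Some(update) = inbox[peer] {
                let merged = agg[peer] | update;
                if merged != agg[peer] {
                    agg[peer] = merged;
                    wants_send[peer] = true;
                }
            }
        }
    }
    agg
}
```

In this model, `simulate(Policy::ReactiveOnly, 1, 10)` never progresses past the initial state: both initial updates are lost at tick 0, and afterwards neither peer has a reason to send again. With `Policy::PeriodicResend` and the same loss, both peers reach the full aggregate once messages get through.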

nibhar, Nov 13 '21 17:11

This behavior can be observed in several of the CI runs, where the 4-validator scenario fails because blocks stopped being produced after some timeout. Some potential examples of this issue:
https://github.com/nimiq/core-rs-albatross/runs/4212024885?check_suite_focus=true
https://github.com/nimiq/core-rs-albatross/runs/4210729994?check_suite_focus=true
https://github.com/nimiq/core-rs-albatross/runs/4204373089?check_suite_focus=true

viquezclaudio, Nov 15 '21 15:11