Anastasios Kichidis
### Description Following the discussions about the [disk clean up](https://www.notion.so/mystenlabs/Design-doc-Disk-Cleanup-GC-41c4eab651334ec2bbfd822a0c412c88), it appears that we don't have a good reason to store the headers in the `header_store` in the first place. It...
### Description In a few places in our codebase we check certificate validity. Some of those places are: * the [core](https://github.com/MystenLabs/narwhal/blob/4bbbd6e58bb86edcff570bbe03ad209bf72bfa58/primary/src/core.rs#L370) * the [block_synchronizer](https://github.com/MystenLabs/narwhal/blob/4bbbd6e58bb86edcff570bbe03ad209bf72bfa58/primary/src/block_synchronizer/responses.rs#L112) We should trace those occurrences via...
### Description We already use a few timeout configurations across multiple components when retrieving resources. Some examples: the `block_synchronizer` timeout configs: https://github.com/MystenLabs/narwhal/blob/fdd4f0b45b1d9bf499fb7822d8e1d36144b09fe6/config/src/lib.rs#L141 Similarly we do have...
## Steps to Reproduce Issue On a 4-validator cluster we let the rounds advance and then shut down 2 of the 4 nodes (f+1 failures). The protocol should...
## Steps to Reproduce Issue On a 4-validator cluster we let the rounds advance and then shut down 2 of the 4 nodes (`f+1` failures). The protocol should...
During crash recovery we send the executor all the certificates that `have been committed` but `not executed` yet. To achieve that, both the `consensus_store` and `state` are...
## Description In the latest incident we observed the faucet erroring and unable to process requests. More specifically, in DevNet we started seeing errors (after 8:00 am GMT on 03/11/2022)...
This PR fixes the recovery flow for Consensus. In DevNet we observed cases where the Consensus DAG was restored and nodes started committing again certificates that...