snarkOS
snarkOS copied to clipboard
fix: limit outstanding certificate requests
This attempts to fix https://github.com/AleoHQ/snarkOS/issues/2935.
I was able to reproduce using the following malicious code (credit @niklaslong): https://github.com/niklaslong/snarkOS/tree/tmp/2935-replication
Run one node with this code, and the rest with normal testnet3. After 20 rounds, the malicious node will send fake certificate requests.
With this fix, the nodes reject the bad certs after the max outstanding requests of 20 is reached, and then carry on, before all nodes would crash with stack overflow.
The MAX_OUTSTANDING_CERTIFICATE_REQUESTS is chosen arbitrarily at 20, please change as you see fit.
I am the reporter of #2935. The root cause of the issue is that the validator didn't verify the certificate before entering recursion. I think we can resolve the issue more directly:
- Upon receiving a
BatchPropose, verify the direct parent certificates are all valid. - Upon receiving a
Certificatethrough broadcast, immediately verify if it is valid.
Also, we may support Batch Certificate Request and avoid using recursion.
I am the reporter of #2935. The root cause of the issue is that the validator didn't verify the certificate before entering recursion. I think we can resolve the issue more directly:
- Upon receiving a
BatchPropose, verify the direct parent certificates are all valid.- Upon receiving a
Certificatethrough broadcast, immediately verify if it is valid.Also, we may support
Batch Certificate Requestand avoid using recursion.
Verify the BatchPropose or BatchCertificate rely on the previous certificates. And we don't know the committee has changed or not when receive a BatchPropose or BatchCertificate.
I am the reporter of #2935. The root cause of the issue is that the validator didn't verify the certificate before entering recursion. I think we can resolve the issue more directly:
- Upon receiving a
BatchPropose, verify the direct parent certificates are all valid.- Upon receiving a
Certificatethrough broadcast, immediately verify if it is valid.Also, we may support
Batch Certificate Requestand avoid using recursion.Verify the
BatchProposeorBatchCertificaterely on the previous certificates. And we don't know the committee has changed or not when receive aBatchProposeorBatchCertificate.
Yes, this is a problem for the current implementation.
I submitted another bug report before, which is about the committee update. The report shows that updating the committee at every block will cause some consensus issues. Maybe we can fix that issue first to make it easier to validate the certificate.
@randomsleep @ghostant-1017 Is the issue still present in latest commit of mainnet-staging? If there is a better design, we can close this PR and work on a new issue.
@howardwu I believe this issue still exists in mainnet-staging. Now the committee is fixed for every round. We can address this issue by:
- Upon receiving a Certificate, immediately check if it has reached quorum before recursion. If the Certificate is the parent of an already checked Certificate, we can skip the check.
- Upon receiving a BatchPropose, immediately perform some checks before recursion. Make sure the direct parent Certificate has reached quorum and
parent_round == batch_propose_round + 1