[Bug] Malicious validator can send fake BlockResponse to block honest validators from processing messages
🐛 Bug Report
- Block.Transactions limits the maximum number of transactions to 1048575 in the deserialization logic https://github.com/AleoNet/snarkVM/blob/454d555a0ee1478dce5fa1822b64525a361b6b27/ledger/block/src/transactions/bytes.rs#L29-L32
// Ensure the number of transactions is within bounds.
if num_txs as usize > Self::MAX_TRANSACTIONS {
return Err(error("Failed to read transactions: too many transactions"));
}
- Malicious validators can construct BlockResponse messages by filling a large number of fake transactions
- The victim's message processing logic is stuck because of the deserialization of a large number of fake transactions https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L634-L635
- The victim's block synchronization may stop due to being stuck on deserialization, see: https://github.com/ProvableHQ/snarkOS/pull/6
Your Environment
snarkOS Version: cf83035ab79907329208a7f4e35d77e8e49d0596
I think there are two problems to solve:
- Refer to the changes in https://github.com/AleoNet/snarkOS/pull/3304 and put the deserialization work into the rayon thread
- Appropriately reduce MAX_TRANSACTIONS (I remember that each certificate can only have a maximum of 50 transactions)
@joske can you confirm if this is an issue (and the related PR resolves it?)
I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.
I don't know about the MAX_TRANSACTIONS
I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.
I don't know about the MAX_TRANSACTIONS
I've looked at change 3304, which is a change to node/router/src/inbound.rs, but the problem described there is within the BFT module: https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L629-L644
The value of MAX_TRANSACTIONS is hardcoded to 1048575, which means that a fake BlockResponse may be filled with 1048575 transactions, which will make deserialization extremely time-consuming.
it's probably a good idea to defer this deserialization to rayon in Gateway too, yes.
On the MAX_TRANSACTIONS front, there are 2 limiters:
- The protocol limit allows up to
2^20transactions according to our merkle tree standard. - The BFT worst case scenario formula is
MAX_NUM_VALIDATORS * NUM_TRANSACTIONS_IN_PROPOSAL * MAX_GC_ROUNDSas defined byTransactions::<N>::MAX_ABORTED_TRANSACTIONS. In testnet, this value is 100 * 50 * 100 = 500,000. Which is already large enough that the problem still persists
I would say that this problem needs a more fundamental fix/avoidance than just lowering the size check.