snarkOS icon indicating copy to clipboard operation
snarkOS copied to clipboard

[Bug] Malicious validator can send fake BlockResponse to block honest validators from processing messages

Open elderhammer opened this issue 1 year ago • 6 comments

🐛 Bug Report

  1. Block.Transactions limits the maximum number of transactions to 1048575 in the deserialization logic https://github.com/AleoNet/snarkVM/blob/454d555a0ee1478dce5fa1822b64525a361b6b27/ledger/block/src/transactions/bytes.rs#L29-L32
        // Ensure the number of transactions is within bounds.
        if num_txs as usize > Self::MAX_TRANSACTIONS {
            return Err(error("Failed to read transactions: too many transactions"));
        }
  1. Malicious validators can construct BlockResponse messages by filling a large number of fake transactions
  2. The victim's message processing logic is stuck because of the deserialization of a large number of fake transactions https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L634-L635
  3. The victim's block synchronization may stop due to being stuck on deserialization, see: https://github.com/ProvableHQ/snarkOS/pull/6

Your Environment

snarkOS Version: cf83035ab79907329208a7f4e35d77e8e49d0596

elderhammer avatar Jun 14 '24 11:06 elderhammer

I think there are two problems to solve:

  1. Refer to the changes in https://github.com/AleoNet/snarkOS/pull/3304 and put the deserialization work into the rayon thread
  2. Appropriately reduce MAX_TRANSACTIONS (I remember that each certificate can only have a maximum of 50 transactions)

elderhammer avatar Jun 14 '24 11:06 elderhammer

@joske can you confirm if this is an issue (and the related PR resolves it?)

vicsn avatar Jun 14 '24 19:06 vicsn

I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.

I don't know about the MAX_TRANSACTIONS

joske avatar Jun 17 '24 13:06 joske

I think #3304 should avoid the node getting stuck on big blocks. That said, there's still a deadlock in sync that we haven't found yet.

I don't know about the MAX_TRANSACTIONS

I've looked at change 3304, which is a change to node/router/src/inbound.rs, but the problem described there is within the BFT module: https://github.com/AleoNet/snarkOS/blob/cf83035ab79907329208a7f4e35d77e8e49d0596/node/bft/src/gateway.rs#L629-L644

The value of MAX_TRANSACTIONS is hardcoded to 1048575, which means that a fake BlockResponse may be filled with 1048575 transactions, which will make deserialization extremely time-consuming.

elderhammer avatar Jun 17 '24 13:06 elderhammer

it's probably a good idea to defer this deserialization to rayon in Gateway too, yes.

joske avatar Jun 17 '24 14:06 joske

On the MAX_TRANSACTIONS front, there are 2 limiters:

  1. The protocol limit allows up to 2^20 transactions according to our merkle tree standard.
  2. The BFT worst case scenario formula is MAX_NUM_VALIDATORS * NUM_TRANSACTIONS_IN_PROPOSAL * MAX_GC_ROUNDS as defined by Transactions::<N>::MAX_ABORTED_TRANSACTIONS. In testnet, this value is 100 * 50 * 100 = 500,000. Which is already large enough that the problem still persists

I would say that this problem needs a more fundamental fix/avoidance than just lowering the size check.

raychu86 avatar Jul 17 '24 23:07 raychu86