namada
namada copied to clipboard
Replay protection space optimization
Currently we implemented a hash-based replay protection mechanism that involves writing two hashes (32 bytes each) to storage. This set of hashes is, currently, monotonically increasing since removing hashes would allow replay attacks.
To save some storage space we could think about a mechanism that would allow us to remove the hashes: to do so we'd need to leverage the expiration
field of a transaction. This field defines the last instant of time within which the tx must be executed: once this timestamp is passed, the transaction will become invalid and never applied. We can therefore remove the hashes of expired transactions from storage since these would be invalid anyway.
To accomplish this we need to:
- Make the
expiration
field mandatory (it'sOptional
atm), or at least change the validation process to reject any wrapper tx without an expiration - Set a reasonable limit for the expiration time (e.g. 1 day) to enforce in protocol so that it's impossible for user to set expiration times too far in the future (which would defy the purpose)
- Move the hashes outside the subspace of the db to prevent writing diff keys (which would actually increase the space occupation), this would also improve space occupation by itself since it would prevent the writing of diff keys
- Attach the expiration associated with a transaction hash in storage
- Generate a garbage collector that periodically looks for expired txs' hashes and remove them from storage
- EXTRA UNRELATED STEP if the inner transaction hash is indeed committed to storage (e.g. no out-of-gas, invalid sig, etc...) than the hash of the wrapper is redundant since the wrapper cannot be replayed anyway (because of the check on the inner tx hash) meaning that we can remove the wrapper hash from storage for further space saving (especially since we can expect most of the inner txs to be valid)
The search for expired txs might be computationally expensive since the hashes are stored in the db key while the expiration will be the associated value (linear search): we might think about optimizing this without an excessive overhead on the storage (again to avoid going against the entire goal of this proposal).
We also need to preserve the correct behavior of the rollback
command: we can take advantage of the fact that a rollback is limited to the previous block height, so we could store these hashes under a special key in storage just for one height for this purpose. This mechanism should also be applied to the deletions that we operate outside of this pruning mechanism (e.g. hash deletion for out-of-gas error). Alternatively, only for this latter cases (non-pruning deletions) we could rely on the rolled-back tx-queue which also contains the hashes of the sections.
Regarding the db-dump
command, given #1192, it will be possible to dump the state at a custom height: in this case we'll simply avoid dumping the tx hashes which are of little interest anyway.
I propose pruning old replay protection records by "expired epoch".
The expiration
field could be changed to expired_epoch_offset
to calculate the expired epoch when the transaction will expire. E.g. when the current epoch is 10 and expired_epoch_offset
is 5, the transaction will expire at the end of Epoch 15.
The key of a replay protection record could be {REPLAY_PROTECTION_ADDR}/{expired_epoch}/{tx_hash}
. When a new epoch starts, we can delete it with the prefix {REPLAY_PROTECTION_ADDR}/{new_epoch - 1}
.
So this solution would simplify the linear search for expired transactions, which would be very welcome, but I have a couple of concerns:
- The
expiration
field was set to be aDateTime
to give users maximum resolution (e.g. if an epoch lasts 1 day, someone might want a transaction to live less than that, like for a few hours or even minutes) - In any case I think that the expiration should always be an absolute value and not a relative one (at least in the transaction) because a malicious node could keep the transaction for itself and propagate it at a later time. Still, we could compute the offset once the transaction has been included in the block and we are about to write the hashes to storage, but in the tx itself we should make sure to keep the absolute value
- By placing the hashes under a key, the check for a replay attack won't be constant anymore since we'd need to iterate over the different
expired_epoch
subkeys and look in all of them if the hashes are present until we find them or we run out of subkeys
Instead of epochs
we could use the block height
in your suggestion: this should keep the pruning algorithm relatively fast (even if not constant time as with epochs) and grant us some more resolution but, for the last point here above, this solution would make the replay protection check even more expensive.
We might be safe if we set a small enough limit on the expiration of a tx, keeping the number of epoch or block_height buckets contained. But still, since the replay attack check is done on every transaction in every block, and the space optimization code would only run once per epoch, I'd prefer to keep the replay check as fast as possible at the cost of sacrificing a bit the speed of the pruning code.
What's the status of this issue?
We only implemented the extra unrelated step. We were speaking the other day about the fact that, at the moment, if a transaction doesn't carry an expiration, is valid indefinitely. We were thinking about adding (at least) a sane default in the client but we could also enforce something on the protocol side. This second option would open up the door for the optimization described in this issue
Hmm, how exactly would we enforce this on the protocol side? We'd have to require that transactions include a recent block hash, or something (so the protocol can know a date before they were created, and enforce that the expiration must be within some bound of that date), I expect. We could do this but it seems like a substantial change. I'd say that a default in the client is sufficient for now. wdyt?
I believe we can just use the expiration
field of the transaction. This is currently an Option
(and I believe it should remain so because protocol transaction don't expire). We can enforce that this value is set to Some(datetime)
and that this datetime does not exceed by a predefined value (could be a protocol parameter) the datetime of the last block produced (Comet time which is BFT). In the set of the hashes we can set the value (currently empty) to the expiration and periodically prune the expired transactions.
I believe for now we can just set a default in the client but I'd also like to wait for a response from the audit since they are looking at this issue too.
I believe we can just use the expiration field of the transaction. This is currently an Option (and I believe it should remain so because protocol transaction don't expire). We can enforce that this value is set to Some(datetime) and that this datetime does not exceed by a predefined value (could be a protocol parameter) the datetime of the last block produced (Comet time which is BFT). In the set of the hashes we can set the value (currently empty) to the expiration and periodically prune the expired transactions.
We can enforce this in the client, but I don't think the protocol can enforce it, because the transaction may be created and gossiped around (or stored somewhere) but not yet in a block for a long time.