namada icon indicating copy to clipboard operation
namada copied to clipboard

MASP storage optimization

Open grarco opened this issue 1 year ago • 4 comments

Currently the MASP storage is managed in a suboptimal way. More specifically, the following can be observed:

  1. All the keys are written under the subspace, meaning they are subject to diffs and merklization which is unneeded because we only need the most updated value of the masp keys and we already manage the masp merkle trees in a separate way
  2. We are currently writing the Transfer and Transaction objects (plus some other metadata) into storage to allow clients to scan the history of masp transactions and construct their internal states to correctly produce future transactions
  3. The conversion_state https://github.com/anoma/namada/blob/6c6b3bf56b689e3a47e62d2596152b87536a9fd8/core/src/ledger/masp_conversions.rs#L25-L38 is kept in memory under the Storage object

Issue 1 should be solved by #2265.

Issue 2 could be solved by not writing this data to storage since we already have it in the block storage. Clients could query the blocks instead (possibly with the help of an indexer), to get the same results. We should also make sure to keep the blocks around in case of a chain upgrade (see discussion at https://github.com/anoma/namada/pull/2248#discussion_r1430888110).

Issue 3 is probably not as major as the previous two. The ConversionState is effectively only queried by clients and updated in protocol once per epoch, so we could probably move it to storage to reduce the memory consumption (hopefully without increasing the query time too much because of the need to retrieve data from storage first)

grarco avatar Dec 28 '23 14:12 grarco

@cwgoes for reference

grarco avatar Dec 28 '23 14:12 grarco

Thanks for the writeup. I think we should definitely fix 1 & 2. Regarding 3, I'm not too worried about memory consumption, as long as we have a reasonable understanding of the bounds and total Namada memory usage does not exceed single-digit GBs. Do we have a sense of how large (concretely) this struct would be?

cwgoes avatar Jan 02 '24 02:01 cwgoes

I've tried to run some approximate calculations. So the ConversionState struct is composed of four fields:

  • normed_inflation, which is an Option<u128>, is constant at ~ 17 bytes
  • tree, which is a FrozenCommtimentTree that we update once per epoch
  • tokens, which is a map from a token alias to its address should remain low in size (BTreeMap<String, Address>)
  • assets, which is a map between AssetType and some data about it, will actually grow once per epoch, more specifically we'll need two inserts per epoch per every unique (token, denom) couple, one for the old asset and one for the new asset, but the one for the old asset is actually an overwrite of the previous value, so we can count just one insert

Here are some estimations on the size of the types involved (these are based only on the types, I'm not taking into account optimizations/paddings that the compiler might perform, e.g. to align types in memory):

Alias ~ 3 UTF8 chars, (suppose ASCII chars, so one byte per char), 3 bytes, ignore the size of the String smart pointer
AssetType ~ 33 bytes
Address ~ 20 bytes
MaspDenom = 1 byte (we have 4 denominations)
Epoch = 8 bytes
AllowedConversions = I28Sum + jujub::ExtendedPoint
	I128Sum = BTreeMap<AssetType, u128> => but actually this map is always empty for us
	ExtendPoint = 5 * 4 * 8 bytes = 160 bytes
usize = 8 bytes on 64 bits machines

Node = 32 bytes
FrozenCommitmentTree = (Vec<Node>, usize)
conv_notes = number of conversion notes, one for every value (so asset type) in the Map of assets in ConversionState
Capacity of Vec<Node> = 31 + 2 * conv_notes

So I believe the cost in memory of the conversion state could be divided into two parts: a constant (or semi-constant) part and a variable part that gets updated with every epoch change.

The constant part would be the cost of the normed_inflation plus the cost of the tokens map, as a function of the number of token addresses $t$:

$M_c(t) = 17 + t * (3 + 20) = 37 + 3t = C + 3t$

The epoch-variable part (at a given epoch $e$) is composed of the merkle tree and the assets fields. Assuming no new tokens are added in the current epoch, the number of new asset types (where every AssetType is the concatenation of a token address, a masp denomination and the epoch) that will be inserted, based on the number of token addresses, is:

$n_a(t) = (t * 4) * (33 + 20 + 1 + 8 + 160 + 8)$

And the delta increase of the frozen tree compared to its value at the previous epoch will be:

$d_{tree}(n_a) = 32 * (31 + 2 * n_a)$

The variable part can then be expressed as:

$M_{v, e}(t)= d_{tree} + n_a = 32 * (31 + 2 * n_a) + n_a = 992 + 64n_a + n_a = 992 + 65n_a = 992 + 65 * (4t * 230) = 992 + 59800t$

Ignoring the constant parts, the increase in memory consumption given by the variable part would be approximately 59800 bytes per token address per epoch, so ~ 58kB

grarco avatar Jan 02 '24 16:01 grarco

Thanks! Excellent write-up. I think that those memory costs are fine for the foreseeable future.

cwgoes avatar Jan 02 '24 17:01 cwgoes

Closing as completed

grarco avatar Apr 11 '24 15:04 grarco