Problem: db size increases too fast
Investigate to see if there is any low-hanging fruit to reduce the db size.
For reference:
939G application.db
42G blockstore.db
1.0G cs.wal
46M evidence.db
4.0K priv_validator_state.json
47M snapshots
81G state.db
238G tx_index.db
Remove tx_index.db
Currently we rely on the tx indexer to query txs by eth tx hash; an alternative solution is to store that index in a standalone kv db on the app side, so we don't need to retain all the tx indexes.
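To make that concrete, here is a minimal sketch of what such a standalone index might look like, assuming a tm-db backed store keyed by the eth tx hash and holding just the block height and tx position. The store name, key layout, and helper names are hypothetical, not an existing cronos API, and the exact tm-db constructor signature depends on the version in use.

package txindex

// Minimal sketch of a standalone eth-tx-hash index kept on the app side,
// assuming a tm-db backend. Store name, key layout and helpers are hypothetical.

import (
    "encoding/binary"

    dbm "github.com/tendermint/tm-db"
)

// TxRef is what we need to serve queries by eth tx hash without tx_index.db:
// the block height and the tx position inside that block.
type TxRef struct {
    Height  int64
    TxIndex uint32
}

type EthTxIndexer struct {
    db dbm.DB
}

// NewEthTxIndexer opens (or creates) a small GoLevelDB database next to the
// other data dirs; the constructor signature depends on the tm-db version.
func NewEthTxIndexer(dir string) (*EthTxIndexer, error) {
    db, err := dbm.NewDB("ethtxindex", dbm.GoLevelDBBackend, dir)
    if err != nil {
        return nil, err
    }
    return &EthTxIndexer{db: db}, nil
}

// Index records hash -> (height, txIndex), e.g. called while processing a block.
func (i *EthTxIndexer) Index(ethTxHash []byte, ref TxRef) error {
    value := make([]byte, 12)
    binary.BigEndian.PutUint64(value[:8], uint64(ref.Height))
    binary.BigEndian.PutUint32(value[8:], ref.TxIndex)
    return i.db.Set(ethTxHash, value)
}

// Get resolves an eth tx hash back to its block position.
func (i *EthTxIndexer) Get(ethTxHash []byte) (TxRef, bool, error) {
    value, err := i.db.Get(ethTxHash)
    if err != nil || value == nil {
        return TxRef{}, false, err
    }
    return TxRef{
        Height:  int64(binary.BigEndian.Uint64(value[:8])),
        TxIndex: binary.BigEndian.Uint32(value[8:]),
    }, true, nil
}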
RocksDB uses Snappy as the default compression algorithm. We could use LZ4 or other more aggressive (but possibly more resource-consuming) algorithms as its compression option.
ref:
https://github.com/facebook/rocksdb/wiki/Compression
https://github.com/tendermint/tm-db/blob/d24d5c7ee87a2e5da2678407dea3eee554277c83/rocksdb.go#L33
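As an illustration (not the actual tm-db wiring), switching the compression algorithm is a one-line option change; the sketch below uses the grocksdb Go binding, while the referenced tm-db rocksdb.go sets up its options in a similar fashion with its own binding. A sketch of opening a RocksDB instance with LZ4 instead of the Snappy default:

package main

import (
    "log"

    "github.com/linxGnu/grocksdb"
)

func main() {
    // Start from RocksDB's defaults, as tm-db does, then override compression.
    bbto := grocksdb.NewDefaultBlockBasedTableOptions()

    opts := grocksdb.NewDefaultOptions()
    opts.SetBlockBasedTableFactory(bbto)
    opts.SetCreateIfMissing(true)

    // Snappy is the default; LZ4 is an alternative, and the RocksDB wiki lists
    // zlib/zstd as stronger (but more CPU-hungry) options.
    opts.SetCompression(grocksdb.LZ4Compression)

    db, err := grocksdb.OpenDb(opts, "./application.db")
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()
}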
Remove tx_index.db
Currently we rely on the tx indexer to query txs by eth tx hash; an alternative solution is to store that index in a standalone kv db on the app side, so we don't need to retain all the tx indexes.
Yep, we should consider using a new kvstore just for storing the tx hash mapping. Also, we could disable the Tendermint indexer to improve consensus performance.
you mean nodes could choose not to have this tx_index.db by moving this part off-chain?
Yes, by storing the eth tx hash index in another place.
I will start a testing build with the custom RocksDB setup to see how much it can be improved.
# IndexEvents defines the set of events in the form {eventType}.{attributeKey},
# which informs Tendermint what to index. If empty, all events will be indexed.
#
# Example:
# ["message.sender", "message.recipient"]
index-events = []
There's an option in app.toml to fine-tune which events to index. The minimal set for json-rpc to work should be:
index-events = ["ethereum_tx.ethereumTxHash", "ethereum_tx.txIndex"]
EDIT: ethereum_tx.txIndex is necessary too.
For reference:
939G application.db 42G blockstore.db 1.0G cs.wal 46M evidence.db 4.0K priv_validator_state.json 47M snapshots 81G state.db 238G tx_index.db
At which block height was this DB scale observed?
Looks like lz4 might be working: the current application.db of the testing node at block height 1730K is around 511G. Projecting to today's block height (2692K), it will be around 755G. At the same time, the application.db of the full node with RocksDB using Snappy is 1057G, so roughly 25% space saving.
Wait until the testing node fully syncs up to the network and see the final result.
@tomtau mentioned we could do some statistics on the application.db to see what kinds of data occupy the most space, then see if there's any waste that can be saved in the corresponding modules. For example, iterate the iavl tree and sum the value lengths under each module's prefix.
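A rough sketch of that kind of statistic, under the assumption that we iterate the raw application.db with tm-db and group key/value bytes by the rootmulti "s/k:<module>/" prefix; this measures raw iavl node sizes per module rather than decoded application values, and the backend/path arguments are placeholders:

package main

import (
    "bytes"
    "fmt"

    dbm "github.com/tendermint/tm-db"
)

// sizeStats accumulates raw key/value byte counts for one module prefix.
type sizeStats struct {
    count    int
    keyBytes int
    valBytes int
}

func main() {
    // Open application.db; backend and path are placeholders.
    db, err := dbm.NewDB("application", dbm.RocksDBBackend, "./data")
    if err != nil {
        panic(err)
    }
    defer db.Close()

    // cosmos-sdk rootmulti stores each module under the "s/k:<module>/" prefix.
    stats := map[string]*sizeStats{}

    itr, err := db.Iterator(nil, nil)
    if err != nil {
        panic(err)
    }
    defer itr.Close()

    for ; itr.Valid(); itr.Next() {
        key, val := itr.Key(), itr.Value()

        module := "other"
        if bytes.HasPrefix(key, []byte("s/k:")) {
            if end := bytes.IndexByte(key[4:], '/'); end >= 0 {
                module = string(key[4 : 4+end])
            }
        }

        s, ok := stats[module]
        if !ok {
            s = &sizeStats{}
            stats[module] = s
        }
        s.count++
        s.keyBytes += len(key)
        s.valBytes += len(val)
    }

    for module, s := range stats {
        fmt.Printf("%-12s pairs=%d keyBytes=%d valBytes=%d\n",
            module, s.count, s.keyBytes, s.valBytes)
    }
}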
BTW, this is the pruning=default node size (thanks @allthatjazzleo):
535G /chain/.cronosd/data/application.db
20K /chain/.cronosd/data/snapshots
44G /chain/.cronosd/data/blockstore.db
120G /chain/.cronosd/data/state.db
312G /chain/.cronosd/data/tx_index.db
20K /chain/.cronosd/data/evidence.db
1023M /chain/.cronosd/data/cs.wal
1011G /chain/.cronosd/data/
Compared to the full archive one:
1.1T /chain/.cronosd/data/application.db
79M /chain/.cronosd/data/snapshots
47G /chain/.cronosd/data/blockstore.db
90G /chain/.cronosd/data/state.db
260G /chain/.cronosd/data/tx_index.db
78M /chain/.cronosd/data/evidence.db
1.1G /chain/.cronosd/data/cs.wal
1.5T /chain/.cronosd/data/
pruning=default only keeps the last 100 states, so it will be good for running the node without query functions.
Got the testing node synced up to the planned upgrade height. Using the default compression:
1057776M ./application.db
45714M ./blockstore.db
88630M ./state.db
Using lz4:
1058545M ./application.db
47363M ./blockstore.db
88633M ./state.db
It matches the benchmark in this article: there is no gain in compression ratio, only gains in compression/decompression speed:
https://morotti.github.io/lzbench-web/?dataset=canterbury/alice29.txt&machine=desktop
why is state.db larger in the pruned one? (120GB vs 90GB)
Went through the application.db and got some basic statistics (at height 2933002; the sizes are raw data lengths):
The evm and ibc modules use the major share of store space in the database, which is not surprising; will look at more details in these modules.
evm: ~24.6M kv pairs, keySizeTotal: ~1.3G, valueSizeTotal: ~976M, avg key size: 52, avg value size: 39
ibc: ~2.6M kv pairs, keySizeTotal: ~149M, valueSizeTotal: ~58M, avg key size: 57, avg value size: 22
Another related thing: in v0.6.x we had a minor issue where contract suicide didn't really delete the code and storage; not sure how much impact that has on the db size though.
It feels like ibc shouldn't store so many pairs; can you see the prefixes?
the major key patterns in the ibc store:
acks/ports/transfer/channels/channel-0/sequences/... counts 1003777
receipts/ports/transfer/channels/channel-0/sequences/... counts 1003777
clients/07-tendermint-1/consensusStates/... counts 403893
636C69656E74732F30372D74656E6465726D696E742D31... (hex encoding of clients/07-tendermint-1) counts 134631
I guess some historical (i.e. older than "evidence age") states, acks, receipts... could be pruned from ibc application storage? Do you have a more detailed breakdown of evm?
https://github.com/cosmos/ibc-go/blob/release/v2.2.x/modules/light-clients/07-tendermint/types/update.go#L137
For the consensusStates, there's pruning logic, but it deletes at most one item at a time; we might need to check how many expired ones there currently are.
The sequence keys don't seem to be pruned at all.
Do you have a more detailed breakdown of evm?
Working on it.
The evm store contains:
1: code, where the key is the prefix 01 + codehash (this part should be fine)
2: storage, where the key is the prefix 02 + eth account address + hash of something (trying to figure out)
The EVM module's storage schema is much simpler: contract code and storage, with the storage slots calculated by the evm internally. I guess there's not much to prune there.
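To make the two key shapes concrete, here is a tiny illustration of how such keys are composed. The 0x01/0x02 prefix bytes follow the description above, while the helper names are made up for illustration and are not ethermint's actual functions:

package main

import (
    "encoding/hex"
    "fmt"
)

// Illustrative only: key shapes as described above, not the module's actual helpers.
const (
    prefixCode    = 0x01 // 01 + codehash            -> contract bytecode
    prefixStorage = 0x02 // 02 + address + slot hash -> 32-byte storage value
)

func codeKey(codeHash []byte) []byte {
    return append([]byte{prefixCode}, codeHash...)
}

func storageKey(address, slot []byte) []byte {
    key := append([]byte{prefixStorage}, address...)
    return append(key, slot...)
}

func main() {
    address, _ := hex.DecodeString("1359135B1C9EB7393F75271E9A2B72FC0D055B2E") // 20 bytes
    slot := make([]byte, 32)                                                    // slot hash, 32 bytes

    // 1 + 20 + 32 = 53 bytes per storage key, consistent with the ~52-byte average
    // key size measured above (code keys at 1 + 32 = 33 bytes pull the average down).
    fmt.Println(len(storageKey(address, slot))) // 53
}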
2: storage, where the key is the prefix 02 + eth account address + hash of something (trying to figure out)
It's the storage slot number, computed internally by the evm.
In the storage part, the address 1359135B1C9EB7393F75271E9A2B72FC0D055B2E has 382381 kv pairs; does it really store that many slots?
https://cronos.org/explorer/address/0x1359135B1C9Eb7393f75271E9a2b72fc0d055B2E/transactions
To verify that, we need the source code; the solidity compiler can output a storage layout file, which is helpful for verifying the slots.
I just had an idea to trade some speed for disk space: currently, the storage format is like this:
02 + address{20} + slot1{32} -> value1{32}
02 + address{20} + slot2{32} -> value2{32}
...
Alternatively:
02 + address{20} + slotHighBits{20} -> {slotLowBits{12} -> value, ...}
It groups at most 4096 values into one KV pair; I guess it helps reduce redundancy in the keys and intermediate overhead in the iavl tree.
It works best for continuous storage regions in a solidity contract, not so well for maps.
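A rough sketch of that grouping, just to make the key split concrete. Encoding the in-group values as a Go map is purely illustrative: a real implementation would need a deterministic on-disk encoding and a read-modify-write on every slot update, which is exactly the speed being traded away.

package main

import "fmt"

const prefixStorage = 0x02

// groupKey keeps the eth address plus the high 20 bytes of the 32-byte slot,
// so all slots sharing those high bytes land in a single KV pair in the iavl tree.
func groupKey(address [20]byte, slot [32]byte) []byte {
    key := append([]byte{prefixStorage}, address[:]...) // 02 + address{20}
    return append(key, slot[:20]...)                    // + slotHighBits{20}
}

// slotGroup is the value of one grouped KV pair: slotLowBits{12} -> value{32}.
type slotGroup map[[12]byte][32]byte

func main() {
    var address [20]byte
    var slot, value [32]byte
    slot[31] = 0x07 // consecutive solidity slots only differ in the low bytes

    groups := map[string]slotGroup{}

    k := string(groupKey(address, slot))
    if groups[k] == nil {
        groups[k] = slotGroup{}
    }
    var low [12]byte
    copy(low[:], slot[20:])
    groups[k][low] = value

    fmt.Println(len(groups[k])) // 1
}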
go-ethereum stores each contract's state in an independent trie, so their structure is like this:
accounts-trie:
contract address -> rootHash
rootHash:
slot1 -> value1
slot2 -> value2
Much less redundancy there.
Another related thing: in v0.6.x we had a minor issue where contract suicide didn't really delete the code and storage; not sure how much impact that has on the db size though.
Does our indexer know how many (and which) contracts have been suicided?
From the datastore we cannot see which contracts have been suicided.
https://github.com/cosmos/ibc-go/blob/release/v2.2.x/modules/light-clients/07-tendermint/types/update.go#L137 For the consensusStates, there's pruning logic, but it deletes at most one item at a time; we might need to check how many expired ones there currently are. The sequence keys don't seem to be pruned at all.
The active consensusStates count is 102 (block height around 3M), and it seems to be about the same at height 2M. Maybe we can prune a lot of kv pairs?
Yes, the latest sequence # in the test data is 1038570 (and I got the count 1003777, not sure why some sequence #s are missing). Can we prune these values if we don't need that data anymore? Each key-value pair is around 80 bytes.
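A heavily hedged sketch of what such a prune pass could look like. It assumes the pruning is exposed by the state machine itself (e.g. run from an upgrade handler against the IBC module store); deleting keys from the raw database directly would break the iavl tree and the app hash. The kvStore interface, the in-memory stand-in, and the cutoff choice are placeholders, not an existing ibc-go API.

package main

import (
    "fmt"
    "sort"
    "strconv"
    "strings"
)

// kvStore stands in for the IBC module store as seen by the state machine.
type kvStore interface {
    Keys() []string // sorted key listing, placeholder for a prefix iterator
    Delete(key string)
}

// pruneOldSequences deletes entries like
// "acks/ports/transfer/channels/channel-0/sequences/<seq>" whose sequence
// number is below cutoff, and returns how many were removed.
func pruneOldSequences(store kvStore, prefix string, cutoff uint64) int {
    pruned := 0
    for _, key := range store.Keys() {
        if !strings.HasPrefix(key, prefix) {
            continue
        }
        seq, err := strconv.ParseUint(key[strings.LastIndex(key, "/")+1:], 10, 64)
        if err != nil || seq >= cutoff {
            continue
        }
        store.Delete(key)
        pruned++
    }
    return pruned
}

// mapStore is a toy in-memory stand-in so the sketch runs end to end.
type mapStore map[string][]byte

func (m mapStore) Keys() []string {
    keys := make([]string, 0, len(m))
    for k := range m {
        keys = append(keys, k)
    }
    sort.Strings(keys)
    return keys
}

func (m mapStore) Delete(key string) { delete(m, key) }

func main() {
    store := mapStore{}
    for seq := 1; seq <= 10; seq++ {
        key := fmt.Sprintf("acks/ports/transfer/channels/channel-0/sequences/%d", seq)
        store[key] = make([]byte, 33) // roughly matches the ~80 bytes per key-value pair observed
    }
    n := pruneOldSequences(store, "acks/ports/transfer/channels/channel-0/sequences/", 8)
    fmt.Println("pruned", n, "left", len(store)) // pruned 7 left 3
}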