Determine file encoding for initial state
Current development on the regenesis feature has split off the initial state from the chain config. This buys us some flexibility to select an appropriate encoding for the initial state file.
Some points to consider:
- Storage -- how much overhead does the encoding have?
- Cursor -- how quickly can we continue decoding the file when resuming the regenesis?
- Performance -- both for decoding and encoding
- Compression -- encoding overhead can be reduced by compression. On the other hand, keeping a cursor to the last loaded element might get complicated. If compression is used, it should ideally be a well-known algorithm without system dependencies.
- Human readability -- keeping the file human readable helps with manipulating it, developing tools to analyze it, and testing.
- Library maturity -- general support for encoding/decoding and also rust-specific support.
- Streaming -- can you decode/encode on the fly? What is the API like? Do you have to handle reading/saving the data yourself?
An update:
Wrote a small utility to benchmark our use case.
Our expected data volume is enormous -- a 1TB rocksdb database at the moment of regenesis. Benchmarking for this much data became problematic so I resorted to a mixed approach of benchmarks and linear regression.
Testing was done in memory to remove the storage speed variable.
The contract state is expected to reach on the order of 100M entries, which means a single `ContractConfig` might take up 6.4GB of memory (or more, depending on the decoding overhead) when loaded (64B per entry). This made us uneasy about keeping the `state` and `balances` collections inside `ContractConfig`.
We agreed that `ContractConfig` would need some flattening. We considered encoding the state and balances into separate files, but ultimately decided against it because of the overhead of matching the contract state and balances back to the contract itself: a 'foreign key' would have to be saved with every storage or balance entry. If we don't enforce an ordering while encoding, then a columnar file layout would be needed so that we can locate all of a contract's state and balances when parsing a single contract. We briefly considered using the rocksdb snapshot (or `sled` or `mysql`) as the target format itself but didn't pursue the idea further.
An alternative would be introducing a somewhat custom layout to fit our needs exactly. We're currently placing the following requirement on whatever data format is chosen:
- First, all `CoinConfig`s are encoded,
- Then, all `MessageConfig`s are encoded,
- Finally, all `ContractConfig`s are encoded. After each `ContractConfig`, all of its state and balances should follow in any order before another `ContractConfig` is decoded.
The `ContractConfig` was broken down into 3 structures (sketched below):
- A `ContractState` representing one storage slot (a `(Bytes32, Bytes32)`)
- A `ContractBalance` representing the balance of the contract for a particular `AssetId` (a `(Bytes32, u64)`)
- `ContractConfig` -- the original fields minus the above.
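To make the flattening concrete, here is a rough Rust sketch of the three structures. The field names and the use of `[u8; 32]` as a stand-in for `Bytes32` are illustrative assumptions, not the exact fuel-core definitions.

```rust
// A rough sketch of the flattened structures; names and types are
// illustrative, not the exact fuel-core definitions.
pub type Bytes32 = [u8; 32];

/// One storage slot of a contract: a `(Bytes32, Bytes32)` pair.
pub struct ContractState {
    pub key: Bytes32,
    pub value: Bytes32,
}

/// The balance a contract holds for a particular `AssetId`: a `(Bytes32, u64)` pair.
pub struct ContractBalance {
    pub asset_id: Bytes32,
    pub amount: u64,
}

/// The original `ContractConfig` fields minus the `state` and `balances`
/// collections, which now follow it in the encoded stream instead.
pub struct ContractConfig {
    pub contract_id: Bytes32,
    pub code: Vec<u8>,
    // ...the remaining original fields
}
```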
An example of the encoded layout:
COIN
... (coins)
COIN
MESSAGE
... (messages)
MESSAGE
CONTRACT#1
STATE
STATE
BALANCE
STATE
BALANCE
CONTRACT#2
...
...
Test data
Randomly generated Coins, Messages, and Contracts in the above-described ordering, e.g. 1k test entries = 333 coins, 333 messages, and 333 contracts (each with 10k state entries and 100 balance entries).
Other formats
- Didn't consider any column-oriented formats (such as Parquet) since we don't benefit from them.
- Didn't consider any formats that need schema code generation (such as Protobuf, Cap'n Proto, etc.).
- Didn't consider formats whose Rust implementation is poorly maintained.
Compression
Each test is run with and without compression.
Native Zstd at the minimal compression level was used.
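For reference, a minimal sketch of what "native Zstd with minimal compression" looks like in Rust, assuming the `zstd` crate; the actual benchmark harness may differ.

```rust
// A minimal sketch of wrapping the encoded byte stream in zstd at the
// lowest compression level (level 1), plus reading it back.
use std::io::{Read, Write};

fn main() -> std::io::Result<()> {
    // Compress while writing: level 1 is the minimal/fastest setting.
    let file = std::fs::File::create("state.zst")?;
    let mut encoder = zstd::stream::write::Encoder::new(file, 1)?;
    encoder.write_all(b"encoded snapshot bytes")?;
    // `finish` flushes the zstd frame; forgetting it truncates the output.
    encoder.finish()?;

    // Decompress while reading: the decoder is just another `Read`er,
    // so the decoding code does not need to know about compression.
    let file = std::fs::File::open("state.zst")?;
    let mut decoder = zstd::stream::read::Decoder::new(file)?;
    let mut bytes = Vec::new();
    decoder.read_to_end(&mut bytes)?;
    assert_eq!(bytes, b"encoded snapshot bytes".to_vec());
    Ok(())
}
```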
Json (serde_json)
Pros:
- Mature, popular
- Serde compatible
- Human readable
- Can be streamed (JSON lines; see the sketch after this list)
Cons:
- Self-describing -- we don't need the schema embedded in every entry, and it adds encoding overhead
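A minimal sketch of streaming newline-delimited JSON with `serde_json`; the `Coin` struct and file name are placeholders, not the real config types.

```rust
// A minimal sketch of streaming JSON lines: one object per line, written
// and read back incrementally without holding the collection in memory.
use std::io::{BufRead, BufReader, BufWriter, Write};

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Coin {
    owner: String,
    amount: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Encode: one JSON object per line, written as it is produced.
    let mut out = BufWriter::new(std::fs::File::create("coins.jsonl")?);
    for amount in 0..3u64 {
        serde_json::to_writer(&mut out, &Coin { owner: "0x00".into(), amount })?;
        out.write_all(b"\n")?;
    }
    out.flush()?;

    // Decode: read back line by line without loading the whole file.
    let input = BufReader::new(std::fs::File::open("coins.jsonl")?);
    for line in input.lines() {
        let coin: Coin = serde_json::from_str(&line?)?;
        let _ = coin;
    }
    Ok(())
}
```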
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:
Storage
400GB without compression, 255GB with compression.
Encoding performance
Around 15 minutes for uncompressed JSON, and around 1 hour 8 minutes for compressed.
Decoding performance
30 minutes for uncompressed JSON, 43 minutes for compressed.
Bincode
Pros:
- Mature
- Serde compatible
- No schema
- Can be streamed (just encode elements one after another; see the sketch after this list)
Cons:
- Not human-readable
- Manual streaming -- collections require their size to be encoded beforehand
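A minimal sketch of the manual streaming approach with the `bincode` 1.x API: elements are serialized one after another and read back until the stream is exhausted. The `Coin` struct and the error handling are illustrative only.

```rust
// A minimal sketch of manual streaming with bincode: no length prefix for
// the whole collection, just one element after another.
use std::io::{BufReader, BufWriter, Write};

use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct Coin {
    owner: [u8; 32],
    amount: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Encode: elements are appended one by one, so the total count does
    // not need to be known up front.
    let mut out = BufWriter::new(std::fs::File::create("coins.bincode")?);
    for amount in 0..3u64 {
        bincode::serialize_into(&mut out, &Coin { owner: [0u8; 32], amount })?;
    }
    out.flush()?;

    // Decode: keep pulling elements until the reader is exhausted.
    // (A real implementation would check that the error is an
    // `UnexpectedEof` rather than silently swallowing corruption.)
    let mut input = BufReader::new(std::fs::File::open("coins.bincode")?);
    while let Ok(coin) = bincode::deserialize_from::<_, Coin>(&mut input) {
        let _ = coin;
    }
    Ok(())
}
```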
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:
Storage
Around 290GB uncompressed, 240GB compressed
Encoding performance
Around 13 minutes uncompressed, and around 46 minutes compressed.
Decoding performance
27 minutes uncompressed, around 37 minutes compressed.
Bson
Pros:
- Mature
- Serde compatible
- Can be streamed
Cons:
- Has overhead for the schema
- The interface doesn't allow encoding into a writer -- extra allocations
- No native support for unsigned types -- have to be careful around `u64` and bigger types since they can't fit into the `i64` supported by BSON (see the sketch after this list)
- Manual streaming -- collections can only be encoded in-memory
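A tiny sketch of the `u64` concern: BSON only has a signed 64-bit integer, so values above `i64::MAX` need an alternate representation. The helper below is hypothetical, not part of the `bson` crate.

```rust
// A hypothetical helper illustrating why unsigned values need care when
// targeting BSON's signed int64.
fn u64_as_bson_i64(value: u64) -> Option<i64> {
    // Values above i64::MAX cannot be represented as a BSON int64 and
    // would need an alternate encoding (e.g. a string or raw bytes).
    i64::try_from(value).ok()
}

fn main() {
    assert_eq!(u64_as_bson_i64(42), Some(42));
    assert_eq!(u64_as_bson_i64(u64::MAX), None);
}
```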
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:
Storage
Around 400GB uncompressed, around 250GB compressed
Encoding performance
Around 15 minutes uncompressed, around 45 minutes compressed.
Decoding performance
33 minutes uncompressed, 46 minutes compressed.
Bench summary graph
Compression impact on cursor
Seeking to a location in uncompressed data is fast. This is not the case for compressed data, since you have to decompress even the data you don't care about. There are workarounds should we need them.
The naive approach takes around 13 minutes to decompress and seek to the end of 400GB of compressed data.
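A minimal sketch of what the cursor looks like on uncompressed data: a plain byte offset that can be sought to directly, assuming the snapshot lives in a regular file. With compressed data, everything before the cursor has to be decompressed and discarded first.

```rust
// A minimal sketch of resuming from a saved cursor on uncompressed data.
// File name and offset are illustrative.
use std::io::{BufReader, Seek, SeekFrom};

fn main() -> std::io::Result<()> {
    let mut reader = BufReader::new(std::fs::File::open("state.bincode")?);

    // Uncompressed: jump straight to the saved byte offset and continue decoding.
    let saved_offset = 1_024u64;
    reader.seek(SeekFrom::Start(saved_offset))?;

    // Compressed: there is no random access into a single zstd frame; the
    // naive workaround is to decompress and discard everything before the
    // cursor, which is what took ~13 minutes in the measurement above.
    Ok(())
}
```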
Current summary: We're going with bincode + optional compression. Starting work on implementing the reading/writing logic.
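A rough sketch of the "bincode + optional compression" direction, assuming the `bincode` and `zstd` crates: the output is either the raw file or a zstd stream, hidden behind `Box<dyn Write>` so the bincode serialization path stays identical either way. Names are illustrative, not the final implementation.

```rust
// A rough sketch of optionally compressed bincode output.
use std::io::Write;

fn open_writer(path: &str, compress: bool) -> std::io::Result<Box<dyn Write>> {
    let file = std::fs::File::create(path)?;
    if compress {
        // `auto_finish` makes the encoder finalize the zstd frame on drop.
        let encoder = zstd::stream::write::Encoder::new(file, 1)?.auto_finish();
        Ok(Box::new(encoder))
    } else {
        Ok(Box::new(file))
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut out = open_writer("coins.bincode.zst", true)?;
    // The same call works whether or not the stream is compressed.
    bincode::serialize_into(&mut out, &42u64)?;
    Ok(())
}
```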
Tried one more format: Parquet. Even though we don't benefit from the columnar layout, Parquet has the advantage of encoding data in chunks (solving the cursor + encoding problem).
It is also not Serde compatible and has poor support for deriving the encoding/decoding code.
Also, the columns of a file should represent a single entity (and not multiple, like a Rust enum might), which means 5 files: coins, messages, contracts, contract_state, and contract_balance.
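A minimal sketch of chunked Parquet writing with the `arrow` and `parquet` crates, using one file per entity (here `contract_state`); the schema, column types, and batch contents are illustrative assumptions.

```rust
// A minimal sketch of writing contract state entries to Parquet in chunks.
// Batches are buffered and flushed as row groups, which is what gives the
// format its natural chunking (and thus a cursor for resuming).
use std::{fs::File, sync::Arc};

use arrow::array::{ArrayRef, BinaryArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::arrow::arrow_writer::ArrowWriter;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One file per entity, e.g. `contract_state.parquet`.
    let schema = Arc::new(Schema::new(vec![
        Field::new("key", DataType::Binary, false),
        Field::new("value", DataType::Binary, false),
    ]));

    let file = File::create("contract_state.parquet")?;
    let mut writer = ArrowWriter::try_new(file, schema.clone(), None)?;

    // Encode the data chunk by chunk instead of all at once.
    for _chunk in 0..3 {
        let keys: ArrayRef = Arc::new(BinaryArray::from_iter_values(
            (0u8..10).map(|i| [i; 32]),
        ));
        let values: ArrayRef = Arc::new(BinaryArray::from_iter_values(
            (0u8..10).map(|i| [i; 32]),
        ));
        let batch = RecordBatch::try_new(schema.clone(), vec![keys, values])?;
        writer.write(&batch)?;
    }

    writer.close()?;
    Ok(())
}
```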
Measurements taken for 10k, 20k, ..., 100k entries. Using linear regression to predict usage for up to 1B entries:
Storage
Around 140 GB without compression, slightly more (+5 GB) with compression (Gzip at the lowest level, just like all the other compression runs).
Encoding performance
Around 300s (uncompressed) and around 1250s (compressed).
Decoding performance
480s (uncompressed) and around 720s (compressed).
All compared
[Graphs comparing storage, encoding performance, and decoding performance across all formats]
It seems Parquet is a clear winner. Will proceed to use it instead of bincode for the regenesis.
Done during https://github.com/FuelLabs/fuel-core/pull/1474