besu Reduce receipt size

trafficstars

Thanks for sending a pull request! Have you done the following?

[x] Checked out our contribution guidelines?
[x] Considered documentation and added the doc-change-required label to this PR if updates are required.
[x] Considered the changelog and included an update if required.
[x] For database changes (e.g. KeyValueSegmentIdentifier) considered compatibility and performed forwards and backwards compatibility tests

Most advanced CI tests are deferred until PR approval, but you could:

[x] locally run all unit tests via: ./gradlew build
[ ] locally run all acceptance tests via: ./gradlew acceptanceTest
[ ] locally run all integration tests via: ./gradlew integrationTest
[ ] locally run all reference tests via: ./gradlew ethereum:referenceTests:referenceTests

PR description

Reduce receipt by introducing a new compact receipt encoding saving approximately 78GB on a freshly synced mainnet snapsync node. This new compact receipt encoding does not include the bloom filter and trims zeros from the log topics and log data similar to what was done in the Geth and Nethermind receipt encoding PRs.

I've added a version for the database format. So users won't be able to downgrade after once this is merged.

Create new compact receipt encoding with no bloom filter and reduce logs by trimming leading zeros on the log topic and log data
New format is detected by the absence of the log bloom filter encoded as a null/zero in RLP
Besu supports both the existing receipt and the compacted receipt format
New format enabled by default
Added feature flag to disable the feature where performance RPCs is critical --receipt-compaction-enabled
Added new Database metadata format v3 for forest and bonsai to avoid runtime errors if a user downgrade Besu to a version that doesn't have support for receipt compact

Testing

Snap sync on mainnet
Checkpoint sync on mainnet
RPC results comparison against version without this change for eth_debugGetRawReceipts, eth_feeHistory, eth_getBlockReceipts, eth_getMinerDataByBlockHash, eth_getTransactionReceipt, eth_getLogs, eth_newFilter, eth_getFilterChanges, eth_getFilterLogs
RPC load test eth_getLogs and eth_getBlockReceipts with 2 req/s over 5 minutes

Engine API performance

Under normal load see similar performance to the nodes not using receipt compaction Screenshot 2024-02-29 at 5 30 39 pm

Load testing

eth_getLogs with 2 req/s over 5 minutes using Gatling on the node while it syncing

receipts min: 2, max: 147, mean: 12, std dev: 11, response 95th percentile: 26, response 99th percentile: 53 min: 3, max: 107, mean: 12, std dev: 8, response 95th percentile: 24, response 99th percentile: 43 min: 3, max: 73, mean: 12, std dev: 8, response 95th percentile: 23, response 99th percentile: 52

control: min: 2, max: 140, mean: 16, std dev: 14, response 95th percentile: 41, response 99th percentile: 58 min: 2, max: 73, mean: 10, std dev: 6, response 95th percentile: 21, response 99th percentile: 32 min: 3, max: 118, mean: 11, std dev: 10, response 95th percentile: 20, response 99th percentile: 55

Storage improvement

control:

Column Family	Keys	Total Size	SST Files Size	Blob Files Size
BLOCKCHAIN	2372566258	948 GiB	167 GiB	781 GiB

receipts:

Column Family	Keys	Total Size	SST Files Size	Blob Files Size
BLOCKCHAIN	2372566326	870 GiB	167 GiB	702 GiB

Disk space saving of ~78GB

Fixed Issue(s)

fixes #6476

Feb 22 '24 02:02 jframe

I like the results of this PR, looking promising! Had a very quick first pass and the first thing I saw was some tests defaulting the creation of a "PrefixedKeyBlockchainStorage" to "false". It makes sense to test both scenarios where possible (I think you already do for unit tests). And In case where the creation of the blockchain isn't actually being tested but created to fulfill a dependency I would rather have that set to true since this is going to be the default value.

Mar 04 '24 08:03 gfukushima

Added new Database metadata format v3 for forest and bonsai to avoid runtime errors if a user downgrade Besu to a version that doesn't have support for receipt compact

Will the user get a startup error instead?

min: 2, max: 147, mean: 12, std dev: 11, response 95th percentile: 26, response 99th percentile: 53

Is this showing response time measured in ms?

What's your interpretation of the results - seems like receipts are better than control (if significant)?

Mar 05 '24 04:03 siladu

Added new Database metadata format v3 for forest and bonsai to avoid runtime errors if a user downgrade Besu to a version that doesn't have support for receipt compact

Will the user get a startup error instead?

They will get a startup error saying the database version is incompatible.

min: 2, max: 147, mean: 12, std dev: 11, response 95th percentile: 26, response 99th percentile: 53

Is this showing response time measured in ms?

What's your interpretation of the results - seems like receipts are better than control (if significant)?

Response time is measured in ms.

Not clear to me from the data that it's better as there is a fair bit of variance between test runs. The only thing I think can I say is that receipts seem at least as good as the control.

Mar 05 '24 05:03 jframe

@fab-10 Would you be able to take a look at the use of the database versioning I'm using in the PR?

Mar 07 '24 06:03 jframe

Have been thinking about the database versioning, wanted to discuss the following options:

Current state of PR: new format enabled/disabled in tandem with optional flag. Backwards compatible but not forwards compatible if a user downgraded.

Update the database version regardless of the feature flag, i.e. version 3 supports --receipt-compaction-enabled behaviour even when the flag is disabled. Effectively the same as (1) but versioning is coupled to the release rather than flag.

Avoid updating the database version for this: a justification could be that we introduce a convention where we only use the versioning for "schema" changes rather than changes to the data contents. Non-forwards-compatible downgrade issue remains though, once the feature has been enabled. I think you said the only benefit of the updated database version is ability to generate an appropriate warning message if they downgrade?

Orthogonal to the versioning...think the original plan was to enable the flag by default. Maybe this can be reconsidered now Dencun has passed. Do any exclusions apply to this (e.g. FULL sync?), or do we rely on announcement/breaking changes warning for users who require historic RPC to evaluate this feature?

Had a discussion with @siladu on these points and the outcome was to leave the PR as is using 1.

Using option 2 has the benefit of a single version and is more directly tied to the capabilities rather than the configuration of --receipt-compaction-enabled flag but if the version was updated regardless of the flag then users wouldn't be able to downgrade to the previous version of Besu.

By having this version conditional the backwards compatibility only affects users who have explicitly enabled this feature rather than everyone and give people plenty of time to be aware of the change. In a future release, the plan is to enable this by default and remove the conditional database version upgrade. At this point, we will have several versions of Besu that users can downgrade if there is an issue in the release.

Mar 18 '24 06:03 jframe

Doing another sync on mainnet to ensure the receipt changes still work after receipt changes

Mar 19 '24 01:03 jframe

besu besu copied to clipboard

Reduce receipt size

Thanks for sending a pull request! Have you done the following?

Most advanced CI tests are deferred until PR approval, but you could:

PR description

Testing

Engine API performance

Load testing

Storage improvement

Fixed Issue(s)

besu
besu copied to clipboard