Mainnet Blocks for examples - Data Hosting Strategy Needed

Open EchoAlice opened this issue 11 months ago • 1 comments

Previously, features implemented within ethereum-consensus were tested against the Python specs via the just test command. However, in my (ongoing) investigation, I haven't found any tests that use validator sets comparable in size to those seen on mainnet.

This is particularly problematic for testing attestation processing optimizations: large validator sets dramatically increase the time it takes to run unoptimized attestation processing at epoch boundaries.

To address the problem, I decided to download mainnet blocks, where validator sets are much larger, and use those in tests. I used Git LFS to host the data because the block files were too large to commit to the repo directly. Since Git LFS has bandwidth and storage limits, I created a private copy of my fork and enabled Git LFS there.

If you think hosting mainnet block data is valuable, I'd appreciate discussion of preferred approaches.

EchoAlice, Dec 02 '24 21:12

I think having mainnet data for testing/benchmarking/analysis would be helpful!

A few paths forward:

  • Include a (pre-state, epoch block, post-state) triple as binary SSZ data in this repo
  • Include the states and a span of blocks. The question is how many: can we get away with one epoch?
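
To make the first option concrete, here is a minimal sketch of how one such fixture could be laid out on disk and measured before deciding whether it belongs in the repo. The file names (pre.ssz, block.ssz, post.ssz) and the helper functions are hypothetical, chosen only for illustration; they are not an existing convention in ethereum-consensus.

```python
from pathlib import Path

# Hypothetical file names for one (pre-state, epoch block, post-state) fixture.
FIXTURE_FILES = ("pre.ssz", "block.ssz", "post.ssz")


def fixture_sizes(fixture_dir):
    """Return a {file name: size in bytes} map for one fixture directory.

    Raises FileNotFoundError if any of the three expected files is missing,
    so an incomplete fixture is caught before a test tries to load it.
    """
    root = Path(fixture_dir)
    sizes = {}
    for name in FIXTURE_FILES:
        path = root / name
        if not path.is_file():
            raise FileNotFoundError(f"missing fixture file: {path}")
        sizes[name] = path.stat().st_size
    return sizes


def total_fixture_bytes(fixture_dir):
    """Total on-disk size of the fixture, in bytes."""
    return sum(fixture_sizes(fixture_dir).values())
```

Running total_fixture_bytes on a candidate fixture directory gives the number that decides between committing the data directly and hosting it elsewhere.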

Do you have a sense of how much data this is? Smaller repos are easier to work with, but I think even one epoch of the relevant data may be a reasonable size.
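
As a rough starting point for the size question, the per-validator part of a serialized state can be estimated from the fixed SSZ field widths. The sketch below assumes the Altair-and-later state layout, counts only the lists that grow with the validator set, and ignores the remaining state fields, the blocks themselves, and any compression. The validator count is a placeholder parameter, not a measured mainnet figure.

```python
# Fixed SSZ widths (bytes) of the fields in one Validator record.
VALIDATOR_RECORD_FIELDS = {
    "pubkey": 48,
    "withdrawal_credentials": 32,
    "effective_balance": 8,
    "slashed": 1,
    "activation_eligibility_epoch": 8,
    "activation_epoch": 8,
    "exit_epoch": 8,
    "withdrawable_epoch": 8,
}

# Other state lists that hold one fixed-width entry per validator
# (assumes the Altair-and-later state layout).
PER_VALIDATOR_LISTS = {
    "balances": 8,
    "previous_epoch_participation": 1,
    "current_epoch_participation": 1,
    "inactivity_scores": 8,
}


def per_validator_bytes():
    """Uncompressed bytes of state that scale with each validator."""
    return sum(VALIDATOR_RECORD_FIELDS.values()) + sum(PER_VALIDATOR_LISTS.values())


def estimate_state_bytes(validator_count):
    """Lower-bound estimate of one serialized state, in bytes.

    Only the per-validator lists are counted; all other state fields
    are ignored, so the real file is somewhat larger.
    """
    return validator_count * per_validator_bytes()


if __name__ == "__main__":
    count = 1_000_000  # illustrative placeholder, not a measured value
    one_state = estimate_state_bytes(count)
    print(f"per validator: {per_validator_bytes()} bytes")
    print(f"one state: ~{one_state / 1e6:.0f} MB uncompressed")
    print(f"pre + post states: ~{2 * one_state / 1e6:.0f} MB uncompressed")
```

Under these assumptions the two states dominate the fixture size, so the number of blocks included matters much less than whether both a pre-state and a post-state are stored.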

ralexstokes, Dec 17 '24 23:12