
Automatic history pruning

Open Marchhill opened this issue 1 year ago • 5 comments

Resolves #7943

Changes

  • Drop history outside of configurable window on fork choice

Types of changes

What types of changes does your code introduce?

  • [ ] Bugfix (a non-breaking change that fixes an issue)
  • [x] New feature (a non-breaking change that adds functionality)
  • [ ] Breaking change (a change that causes existing functionality not to work as expected)
  • [ ] Optimization
  • [ ] Refactoring
  • [ ] Documentation update
  • [ ] Build-related changes
  • [ ] Other: Description

Testing

Requires testing

  • [x] Yes
  • [ ] No

If yes, did you write tests?

  • [x] Yes
  • [ ] No

Documentation

Requires documentation update

  • [x] Yes
  • [ ] No

Add a section on configuring History settings. Users can configure DropPreMerge and HistoryPruneEpochs as they choose.

Requires explanation in Release Notes

  • [x] Yes
  • [ ] No

Added support for automated removal of block history.

Marchhill avatar Jan 10 '25 13:01 Marchhill

Should we consider the state of block processing and synchronization? Should we prune only when we are free from processing/production? Should we wait for blocks/receipts sync, or even state sync, to finish before pruning? Should we consider the settings in block/receipt sync?

Good questions, I would appreciate more input from people who are more familiar with the sync process than I am. My thinking behind not checking if we are syncing:

  • We only prune blocks older than 82125 epochs, outside the weak subjectivity period so it shouldn't affect anything
  • If for some reason it does, then we may want to prune while syncing. If we didn't do this the size of the history DB could get very large during sync, before eventually being pruned. This loses the benefits of history expiry, as we will still need disk space to store history while syncing
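To put the 82125-epoch figure in concrete terms, here is a minimal sketch of the window arithmetic. It assumes mainnet timing (32 slots per epoch, 12 seconds per slot) and a timestamp-based cutoff. The function names are hypothetical and are not taken from this PR, which may compute the cutoff differently.

```python
# Mainnet consensus timing constants (assumed; other networks may differ).
SLOTS_PER_EPOCH = 32
SECONDS_PER_SLOT = 12


def retention_seconds(prune_epochs: int) -> int:
    """Length of the retained history window, in seconds."""
    return prune_epochs * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def cutoff_timestamp(head_timestamp: int, prune_epochs: int) -> int:
    """Blocks with a timestamp strictly below this value fall outside the window."""
    return head_timestamp - retention_seconds(prune_epochs)


def is_outside_window(block_timestamp: int, head_timestamp: int,
                      prune_epochs: int = 82125) -> bool:
    """Hypothetical predicate: True if the block is old enough to be pruned."""
    return block_timestamp < cutoff_timestamp(head_timestamp, prune_epochs)
```

Under these assumptions, 82125 epochs is 82125 × 32 × 12 = 31,536,000 seconds, which is exactly 365 days of history.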

You also mention block production, do you think we shouldn't prune while producing a block? Atm I can't see a problem with doing that

Marchhill avatar Mar 26 '25 11:03 Marchhill

We only prune blocks older than 82125 epochs, outside the weak subjectivity period so it shouldn't affect anything

We sync old blocks, so I suppose such pruning will prune lots of them

  • If for some reason it does, then we may want to prune while syncing. If we didn't do this the size of the history DB could get very large during sync, before eventually being pruned. This loses the benefits of history expiry, as we will still need disk space to store history while syncing

Yes, so what if we change sync to take the pruning border into account and not sync old blocks at all? Then, after syncing, we turn on pruning and prune whatever has become outdated (maybe a dozen blocks or so).

You also mention block production, do you think we shouldn't prune while producing a block? Atm I can't see a problem with doing that

Pruning is a secondary task that requires resources, seems like a good candidate for execution when ProcessingQueueEmpty. Purely optional for this request
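A rough sketch of that idea, purely illustrative: the fork choice handler only records the newest cutoff, and the actual pruning runs when the processing queue reports that it is empty. The class and method names below are hypothetical and do not mirror Nethermind's event API.

```python
from typing import Callable, Optional


class DeferredPruner:
    """Hypothetical helper: remember the latest cutoff, prune only when idle."""

    def __init__(self, prune: Callable[[int], None]) -> None:
        self._prune = prune
        self._pending_cutoff: Optional[int] = None

    def on_fork_choice(self, cutoff: int) -> None:
        # Keep only the most recent (highest) cutoff; older ones are subsumed.
        if self._pending_cutoff is None or cutoff > self._pending_cutoff:
            self._pending_cutoff = cutoff

    def on_processing_queue_empty(self) -> bool:
        # Run the deferred work, if any. Returns True if pruning was triggered.
        if self._pending_cutoff is None:
            return False
        cutoff, self._pending_cutoff = self._pending_cutoff, None
        self._prune(cutoff)
        return True
```

With this shape, several fork choice updates that arrive during heavy processing collapse into a single pruning pass once the node is idle.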

flcl42 avatar Mar 27 '25 13:03 flcl42

there should definitely be changes to the syncer. why sync blocks that i will be pruning away later?! just stop syncing bodies and receipts when reaching the boundary

smartprogrammer93 avatar Mar 27 '25 19:03 smartprogrammer93

For the sync part of the code, the easiest way to integrate this is with some kind of interface, e.g. IBlockPersistenceStrategy with a ShouldPersistBlock(BlockInfo) method. BodiesSyncFeed and ReceiptsSyncFeed each have a SyncStatusList with a TryGetInfosForBatch method that accepts a function, which you can use to specify whether a block should be downloaded or not. You can also update SyncConfigBarrierCalc so that the progress log makes more sense. Because the head, suggested header, and best numbers are all confusing and painful to think about, I suggest just using _blockTree.SyncPivot as the assumed head. It makes things much easier to reason about.
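The shape of that suggestion can be sketched as follows. This is a language-neutral illustration in Python, not the Nethermind C# API: BlockInfo is reduced to a block number, the window is expressed in blocks, and select_for_batch only stands in for the idea of passing a predicate to a batch selector. All names and signatures here are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class BlockInfo:
    """Simplified stand-in for a block descriptor; only the number matters here."""
    number: int


class WindowPersistenceStrategy:
    """Hypothetical strategy: persist only blocks inside the retention window."""

    def __init__(self, get_assumed_head: Callable[[], int], window_blocks: int) -> None:
        # The assumed head is injected so the caller decides which notion of
        # "head" to use (sync pivot, best suggested header, and so on).
        self._get_assumed_head = get_assumed_head
        self._window_blocks = window_blocks

    def should_persist_block(self, info: BlockInfo) -> bool:
        return info.number >= self._get_assumed_head() - self._window_blocks


def select_for_batch(candidates: Iterable[BlockInfo],
                     should_download: Callable[[BlockInfo], bool],
                     batch_size: int) -> List[BlockInfo]:
    """Pick up to batch_size candidates that the predicate accepts."""
    batch: List[BlockInfo] = []
    for info in candidates:
        if len(batch) == batch_size:
            break
        if should_download(info):
            batch.append(info)
    return batch
```

Injecting the head as a callable keeps the strategy independent of which head value is finally chosen, so the choice can change without touching the sync feeds.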

asdacap avatar Apr 11 '25 00:04 asdacap

Actually, peers may not serve blocks if this is not done accurately. So the assumed head should probably be BestSuggestedHeader.

asdacap avatar Apr 11 '25 00:04 asdacap

Need to trigger Eth69ProtocolHandler.NotifyOfNewRange when earliest available block changes. Either as part of this or a follow-up PR.

alexb5dh avatar Jul 31 '25 00:07 alexb5dh