nethermind icon indicating copy to clipboard operation
nethermind copied to clipboard

Move pruning to background thread to reduce latency

Open LukaszRozmej opened this issue 3 years ago • 1 comments

Changes:

Currently pruning blocks block processing, which can lead to increased latency from time to time on block. This moves pruning to background thread.

This is synchronized with block processing by locking on _dirtyNodes, so either block can be processed or a pruning loop can spin once. This allows for thread safety to access trie nodes while not blocking block processing. This also allows reducing latency on heavy pruning with multiple spins of loop by allowing block processing to go in-between loop spins (at the cost of short-term higher memory usage?).

Note: First versions weren't using any synchronization, but this sometimes had issues and threw exceptions.

At the disposal, there is the cancellation of pruning, which waits for the current loop spin of pruning to finish and stops it from further spins.

Types of changes

What types of changes does your code introduce? Put an x in the boxes that apply

  • [x] Bugfix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [ ] Documentation Update
  • [ ] Code style update (formatting, renaming)
  • [x] Refactoring (no functional changes, no api changes)
  • [ ] Build related changes
  • [ ] Other (please describe):

Testing

Requires testing

  • [x] Yes
  • [ ] No

In case you checked yes, did you write tests??

  • [ ] Yes
  • [x] No

Comments about testing , should you have some (optional) Test: Very big pruning cache (like 8-16GB), FullSync+pruning, maybe heavy JsonRpc load hitting the state (eth_call or tracing?), Validate we don't have regressions in snap sync

LukaszRozmej avatar Sep 23 '22 13:09 LukaszRozmej

LGTM. The locking is simple and solves any problem that could arise.

smartprogrammer93 avatar Sep 24 '22 08:09 smartprogrammer93

Starting tests from this branch

kamilchodola avatar Oct 05 '22 11:10 kamilchodola

Things to verify: Processing queue wasn't empty added to queue a new payload: 15686051 on mainnet-teku FullSync issue

kamilchodola avatar Oct 06 '22 10:10 kamilchodola