reth
feat(pruning): fair pruning
Closes https://github.com/paradigmxyz/reth/issues/7343. Related to pruner interruption, ref https://github.com/paradigmxyz/reth/issues/6770.
- Models the exhaustive list of prunable tables as a ring.
- Implements a segment iterator that generates one cycle of segments, starting from a given table.
- Saves last pruned table between prune jobs. This ensures fair pruning, as the next job can pick up where the last one left off.
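To make the ring idea concrete, here is a minimal sketch rather than the PR's actual code. The `Table` enum, the `RING` order, and the `cycle_from` and `run_job` helpers are all hypothetical names introduced for illustration. It also stores the table to start with next rather than the last pruned table, which carries the same information.

```rust
/// Hypothetical, simplified stand-in for the prunable tables (not reth's real type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Table {
    Headers,
    Transactions,
    Receipts,
    AccountHistory,
    StorageHistory,
}

/// Fixed ring order, assumed here purely for illustration.
static RING: [Table; 5] = [
    Table::Headers,
    Table::Transactions,
    Table::Receipts,
    Table::AccountHistory,
    Table::StorageHistory,
];

/// Yields every table exactly once, starting at `start` and wrapping around.
fn cycle_from(start: Table) -> impl Iterator<Item = Table> {
    let offset = RING.iter().position(|t| *t == start).unwrap_or(0);
    RING.iter().copied().cycle().skip(offset).take(RING.len())
}

/// Simulates one prune job that can process at most `budget` tables.
/// Returns the tables it processed and the table the next job should start with.
fn run_job(start: Table, budget: usize) -> (Vec<Table>, Table) {
    let mut order = cycle_from(start);
    let pruned: Vec<Table> = order.by_ref().take(budget).collect();
    // If the whole ring was covered, the next job starts at the same table again.
    let next = order.next().unwrap_or(start);
    (pruned, next)
}
```

Feeding the second return value of `run_job` into the next call is what makes the schedule fair: tables skipped by one job are the first ones visited by the next.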
still have to build some tests
It feels a bit overcomplicated, can we just have a VecDeque of segments that we pop/push from/to?
How I see it:
- VecDeque of segments initialized when the pruner is initialized
- When we prune, we pop segments from the VecDeque one by one until there's none left
- When a limit (with timeout or items deleted) is hit, push segments that we ran to the end of the VecDeque
- On the next run, pop will return segments that we didn't run first
WDYT?
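The proposal above could look roughly like the following sketch. The `Segment` trait here is a hypothetical, much simplified stand-in (reth's real trait has a different signature), and `FakeSegment`, `fake`, and `Pruner` are illustrative names only. As a small simplification, each segment is pushed back right after it runs, which gives the same ordering as pushing the segments that ran once the limit is hit.

```rust
use std::collections::VecDeque;

/// Hypothetical, simplified segment interface (assumption, not reth's API).
trait Segment {
    fn name(&self) -> &'static str;
    /// Prunes up to `limit` entries and returns how many were deleted.
    fn prune(&mut self, limit: usize) -> usize;
}

/// Test double that pretends to hold `remaining` prunable entries.
struct FakeSegment {
    name: &'static str,
    remaining: usize,
}

impl Segment for FakeSegment {
    fn name(&self) -> &'static str {
        self.name
    }

    fn prune(&mut self, limit: usize) -> usize {
        let deleted = limit.min(self.remaining);
        self.remaining -= deleted;
        deleted
    }
}

fn fake(name: &'static str, remaining: usize) -> Box<dyn Segment> {
    Box::new(FakeSegment { name, remaining })
}

struct Pruner {
    segments: VecDeque<Box<dyn Segment>>,
}

impl Pruner {
    fn new(segments: Vec<Box<dyn Segment>>) -> Self {
        Pruner { segments: VecDeque::from(segments) }
    }

    /// Runs segments from the front of the queue until `limit` deletions are used up.
    /// Returns the names of the segments that ran, in order.
    fn run(&mut self, mut limit: usize) -> Vec<&'static str> {
        let mut ran = Vec::new();
        // Visit each segment at most once per run.
        for _ in 0..self.segments.len() {
            if limit == 0 {
                break;
            }
            let mut segment = match self.segments.pop_front() {
                Some(segment) => segment,
                None => break,
            };
            limit -= segment.prune(limit);
            ran.push(segment.name());
            // Segments that ran go to the back, so untouched ones stay in front.
            self.segments.push_back(segment);
        }
        ran
    }
}
```

With three segments of five entries each and a limit of seven, the first run touches the first two segments and the second run starts with the third.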
Saves last pruned table between prune jobs
I am not sure if we need it, what is the case when we want to continue pruning some table inside a segment, and not the whole segment from the beginning?
It feels a bit overcomplicated, can we just have a VecDeque of segments that we pop/push from/to?
How I see it:
- VecDeque of segments initialized when the pruner is initialized
- When we prune, we pop segments from the VecDeque one by one until there's none left
- When a limit (with timeout or items deleted) is hit, push segments that we ran to the end of the VecDeque
- On the next run, pop will return segments that we didn't run first
WDYT?
No need to reallocate memory, easiest is to just save the index we would have pruned next in the Vec<Box<dyn Segment>>.
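A minimal sketch of the saved-index alternative, under the assumption that the list of segments is fixed for the lifetime of the pruner. `RoundRobin` and its `visit` callback are hypothetical names, and the callback returns `false` once the limit has been hit.

```rust
/// Hypothetical round-robin wrapper: the Vec is never reordered,
/// only the index to resume from is remembered between runs.
struct RoundRobin<T> {
    items: Vec<T>,
    next: usize,
}

impl<T> RoundRobin<T> {
    fn new(items: Vec<T>) -> Self {
        RoundRobin { items, next: 0 }
    }

    /// Visits each item at most once, starting at the saved index.
    /// `visit` returns `false` when the limit was hit and the run must stop.
    fn run<F: FnMut(&mut T) -> bool>(&mut self, mut visit: F) {
        let len = self.items.len();
        let start = self.next;
        for step in 0..len {
            let idx = (start + step) % len;
            if !visit(&mut self.items[idx]) {
                // Save the index we would have pruned next.
                self.next = (idx + 1) % len;
                return;
            }
        }
        // A full cycle completed, so the starting point stays unchanged.
    }
}
```

Compared with a queue, this keeps the segments in place and stores a single `usize`, at the cost of the modular index arithmetic.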
True that there is no need to generate the segments, other than for static files, on each call to prune_segments. On second look, I saw that PruneMode::Before is not used in the static allocation of Vec<Box<dyn Segment>>, which is built using PruneModes.
Saves last pruned table between prune jobs
I am not sure if we need it, what is the case when we want to continue pruning some table inside a segment, and not the whole segment from the beginning?
checkpoints are saved when pruning stops
No need to reallocate memory, easiest is to just save the index we would have pruned next in the Vec<Box<dyn Segment>>.
up to you, I'd prefer a VecDeque for a more intuitive API. Since all segments are Box, it's not a big overhead. Also, VecDeque doesn't have a requirement for the elements to be contiguous in memory.
"Since VecDeque is a ring buffer, its elements are not necessarily contiguous in memory." – from https://doc.rust-lang.org/std/collections/struct.VecDeque.html
don't think it makes that big a difference now that this is implemented + tested
blocked by db background task design, cc @Rjected