
Syncing Base archive nodes - MERKLE_STAGE_DEFAULT_CLEAN_THRESHOLD potentially too low

Open jamesstanleystewart opened this issue 7 months ago • 13 comments

Describe the bug

Over the last few days we've had issues with Base archive nodes being unable to get back to head from a snapshot. I've read a few similar threads. We're now running the new i7ie AWS instances, which are fast enough to get to head. However, I noticed one thing along the way that may be impacting Base's sync, and I thought it worthwhile to get an expert's opinion.

During my investigation I found that the MerkleExecute stage has a hard-coded constant: MERKLE_STAGE_DEFAULT_CLEAN_THRESHOLD = 5000 https://github.com/paradigmxyz/reth/blob/main/crates/stages/stages/src/stages/merkle.rs#L43C11-L43C47

From what I can tell, this causes Reth to rebuild the Merkle data if it's syncing more than 5000 blocks from head. On Base, the rebuild takes 2-3 hours on AWS NVMe SSD (MerkleExecute stage_progress=0.07% stage_eta=2h 44m 13s, logged about 6000 blocks from head), and 5000 Base blocks pass roughly every 2.8 hours at a 2-second block time (compared to about 16.7 hours for Ethereum at 12 seconds). This causes nodes to get "stuck" just beyond the 5000-block threshold: by the time the rebuild finishes, the chain has advanced almost a full threshold's worth of blocks. Once a node gets inside the threshold, MerkleExecute finishes very quickly (I just watched an i7ie take 13 minutes for 4000 blocks).
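To make the reasoning above concrete, here is a minimal sketch of the threshold decision as I understand it, plus the time it takes each chain to produce 5000 blocks. This is not reth's actual code: `MerkleStrategy`, `choose_strategy`, and `seconds_to_produce` are hypothetical names for illustration, and the 2-second (Base) and 12-second (Ethereum) block times are assumptions.

```rust
/// Value quoted in this issue for reth's constant of the same name.
const MERKLE_STAGE_DEFAULT_CLEAN_THRESHOLD: u64 = 5_000;

/// Hypothetical enum, for illustration only (not a reth type).
#[derive(Debug, PartialEq)]
enum MerkleStrategy {
    /// Rebuild the Merkle data (slow on Base, per the log line above).
    FullRebuild,
    /// Update only for the blocks that were synced (fast).
    Incremental,
}

/// Hypothetical helper: pick a strategy from how far behind head the node is.
fn choose_strategy(blocks_behind: u64, threshold: u64) -> MerkleStrategy {
    if blocks_behind > threshold {
        MerkleStrategy::FullRebuild
    } else {
        MerkleStrategy::Incremental
    }
}

/// Seconds a chain needs to produce `blocks` blocks at a fixed block time.
fn seconds_to_produce(blocks: u64, block_time_secs: u64) -> u64 {
    blocks * block_time_secs
}

fn main() {
    let threshold = MERKLE_STAGE_DEFAULT_CLEAN_THRESHOLD;

    // 6000 blocks behind, as in the quoted log line; 4000 as in the i7ie run.
    println!("6000 behind: {:?}", choose_strategy(6_000, threshold));
    println!("4000 behind: {:?}", choose_strategy(4_000, threshold));

    // Assumed block times: 2 s on Base, 12 s on Ethereum.
    let base_secs = seconds_to_produce(threshold, 2);
    let eth_secs = seconds_to_produce(threshold, 12);
    println!(
        "5000 blocks: Base {:.1} h, Ethereum {:.1} h",
        base_secs as f64 / 3600.0,
        eth_secs as f64 / 3600.0
    );
}
```

Under those assumptions the window is about 2.8 hours on Base versus about 16.7 hours on Ethereum, which is why the same constant behaves so differently on the two chains.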


I don't know enough about the Merkle data to know for sure if it makes sense to double (or make configurable) that threshold for Base, but one of you will :) Let me know if I've read the situation incorrectly, but I thought it might be useful for others that hit this issue.
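As a rough back-of-the-envelope check on whether doubling would help, the sketch below counts how many blocks Base produces while one rebuild runs and compares that with candidate thresholds. The function name `blocks_during` is hypothetical, the 2-second block time is an assumption, and the rebuild duration is simply the stage_eta from the log line above (so it ignores time spent in other stages).

```rust
/// Hypothetical helper: blocks the chain produces while a rebuild runs.
fn blocks_during(rebuild_secs: u64, block_time_secs: u64) -> u64 {
    rebuild_secs / block_time_secs
}

fn main() {
    // stage_eta=2h 44m 13s from the log line quoted in this issue.
    let rebuild_secs: u64 = 2 * 3600 + 44 * 60 + 13;

    // Assumed Base block time of 2 seconds.
    let produced = blocks_during(rebuild_secs, 2);
    println!("blocks produced during rebuild: {}", produced);

    // Compare the current threshold with a doubled one.
    for threshold in [5_000u64, 10_000] {
        let headroom = threshold.saturating_sub(produced);
        println!("threshold {}: headroom {} blocks", threshold, headroom);
    }
}
```

With these inputs the rebuild alone consumes nearly all of a 5000-block threshold, so any extra time in the other stages would push the node back over it, while a 10000-block threshold would leave about half the window free.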

Steps to reproduce

Sync a Base archive node from a snapshot on an i3en AWS instance

Node logs


Platform(s)

No response

Container Type

Kubernetes

What version/commit are you on?

op-reth:v1.2.0

What database version are you on?

Unsure

Which chain / network are you on?

Base Mainnet

What type of node are you running?

Archive (default)

What prune config do you use, if any?

No response

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

  • [x] I agree to follow the Code of Conduct

jamesstanleystewart, Apr 03 '25 01:04