snarkVM icon indicating copy to clipboard operation
snarkVM copied to clipboard

[Perf] Facilitate the caching of the block tree

Open ljedrz opened this issue 1 month ago • 9 comments

This PR targets https://github.com/ProvableHQ/snarkVM/issues/3026.

I used pprof to analyze the loading of a full mainnet ledger, and the resulting flamegraph:

flamegraph

indicated that nearly 96% of the associated work is the creation of the block tree. I double-checked it, and indeed, around 90% of the loading time was spent on that operation.

These changes facilitate the caching of the block tree alongside the ledger, and automatically loading it on startup. A snarkOS counterpart PR will follow, where one of the methods introduced here will be automatically called on shutdown.

While I don't have the full mainnet ledger locally, I have verified that the one I do have loads more quickly when these changes are introduced.

ljedrz avatar Nov 18 '25 16:11 ljedrz

I ran the numbers on a full maninnet ledger after these changes, and it's a clear improvement:

before: 239.75s
after:   27.82s (down 88%)

Dumping the serialized block tree took 2.48s, and the resulting file was 965MiB.

For future reference, I'm also attaching the updated flamegraph of a ledger load (w/o rayon for increased clarity):

flamegraph_new

I'll analyze it to see if there's room for more improvement, but these would be follow-ups unrelated to the changes from this PR.

ljedrz avatar Nov 20 '25 08:11 ljedrz

Looks great to me! I left one comment about potentially loading an outdated cache, which I believe needs to addressed.

Additionally, I wonder if we could store the block tree on Ledger::drop and not have snarkOS counterpart. That would be cleaner and ensure it is only invoked during shutdown. snarkOS shutdown is a little messy as of now, though, so that might be not possible yet (hence https://github.com/ProvableHQ/snarkOS/pull/3915).

kaimast avatar Nov 20 '25 18:11 kaimast

PR looks great, will approve after Kai's comments are addressed

vicsn avatar Nov 21 '25 09:11 vicsn

I wonder if we could store the block tree on Ledger::drop and not have snarkOS counterpart.

Yeah, it did work just fine, and I guess there's no reason not to have it happen implicitly; I'll remove the snarkOS-side PR if this change is accepted.

In other words, review comments addressed.

ljedrz avatar Nov 21 '25 09:11 ljedrz

After some additional tests I noticed that the Ledger is not internally Arced, so the Drop impl needs to be adjusted; I should have it ready soon.

ljedrz avatar Nov 21 '25 11:11 ljedrz

This is the right direction, but I'm not always seeing the Drop being triggered; I need to run some more snarkOS tests.

ljedrz avatar Nov 21 '25 12:11 ljedrz

@kaimast ok, I have a temporary shutdown solution for snarkOS to ensure that the Drop gets triggered. I'll propose it, and it can later be superseded by your changes.

Edit: filed as https://github.com/ProvableHQ/snarkOS/pull/4021.

ljedrz avatar Nov 21 '25 13:11 ljedrz

I've introduced a very cheap (basically free) coherence check in Ledger::load_unchecked; as a bonus, we now call BlockStore::max_height - which traverses the entire reverse state root map (meaning it is somewhat expensive, even though it is optimized) - only once.

ljedrz avatar Nov 25 '25 15:11 ljedrz

I've tested both possible cases:

  1. when the block tree is above the storage, BlockStore::open now fails with Block {height} exists in tree but not in storage; perhaps you used a wrong block tree cache file?
  2. when the block tree is below the storage, Ledger::load_unchecked now fails with The stored height is different than the one in the block tree; please ensure that the cached block tree is valid or delete it

ljedrz avatar Nov 26 '25 09:11 ljedrz

@ljedrz can you merge in origin/staging?

vicsn avatar Dec 18 '25 16:12 vicsn

Merged staging and addressed the final suggestions.

ljedrz avatar Dec 19 '25 09:12 ljedrz