Feature: Interleaved Snapshot Untar and Indexing - Stage 1

Open apfitzge opened this issue 1 year ago • 1 comments

Problem

Much of the work of rebuilding accounts_db from snapshot(s) can be done while we are untarring the snapshot.

Summary of Changes

This splits the "interleaved untar & indexing" work into a few PR stages. In this stage, snapshot storages are rebuilt as the snapshot is untarred rather than after the untar completes. This also sets up a skeleton SnapshotStorageRebuilder, to which more indexing operations can be moved in future PRs.
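
For readers unfamiliar with the pattern, the interleaving can be sketched as a producer/consumer pipeline. This is only an illustration, not the code in this PR: `parse_storage_name`, `rebuild_worker`, and `rebuild_while_untarring` are hypothetical names, the "untar" side is faked by a loop over file names, and storage files are assumed to be named `<slot>.<id>`.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver};
use std::sync::{Arc, Mutex};
use std::thread;

/// Parse a storage file name of the assumed form "<slot>.<id>".
fn parse_storage_name(name: &str) -> Option<(u64, u64)> {
    let (slot, id) = name.split_once('.')?;
    Some((slot.parse().ok()?, id.parse().ok()?))
}

/// Worker loop: take file names off the shared channel until the sender hangs up.
fn rebuild_worker(
    rx: Arc<Mutex<Receiver<String>>>,
    out: Arc<Mutex<HashMap<u64, Vec<u64>>>>,
) {
    loop {
        // The receiver lock is held only while waiting for the next file.
        let next = rx.lock().unwrap().recv();
        match next {
            Ok(name) => {
                // Non-storage entries are skipped in this sketch.
                if let Some((slot, id)) = parse_storage_name(&name) {
                    out.lock().unwrap().entry(slot).or_default().push(id);
                }
            }
            Err(_) => break, // channel closed: the untar side has finished
        }
    }
}

/// Run a stand-in "untar" producer alongside `num_workers` rebuilder threads.
/// Returns slot -> sorted storage ids.
fn rebuild_while_untarring(files: Vec<String>, num_workers: usize) -> HashMap<u64, Vec<u64>> {
    let (tx, rx) = channel::<String>();
    let rx = Arc::new(Mutex::new(rx));
    let storages = Arc::new(Mutex::new(HashMap::new()));

    let workers: Vec<_> = (0..num_workers)
        .map(|_| {
            let (rx, out) = (Arc::clone(&rx), Arc::clone(&storages));
            thread::spawn(move || rebuild_worker(rx, out))
        })
        .collect();

    // Producer: stands in for the unpacker emitting paths as files land on disk.
    for f in files {
        tx.send(f).unwrap();
    }
    drop(tx); // signal end of archive

    for w in workers {
        w.join().unwrap();
    }
    let mut map = Arc::try_unwrap(storages).unwrap().into_inner().unwrap();
    for ids in map.values_mut() {
        ids.sort_unstable();
    }
    map
}
```

Because the workers block on the channel, they sit idle whenever the producer is the bottleneck, which is why extra per-file work on the consumer side can be close to free.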

Fixes #

apfitzge avatar Jul 12 '22 18:07 apfitzge

Testing on MNB snapshot(s):

  • Full snapshot: snapshot-141271324-5S45gzue6kS5ffkeFrF23CX1tbdEGi8yUmucfeuhS52V.tar.zst
  • Incremental snapshot: incremental-snapshot-141271324-141290659-9KTPeHb81ZjnAYTbqn9bmv9kAyNJR6SESwMWbqxXrH4z.tar.zst
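
|        | Full Snapshot Untar (s) | Incremental Snapshot Untar (s) | Rebuild Bank (s) | Total (s) |
|--------|-------------------------|--------------------------------|------------------|-----------|
| master | 138.1                   | 10.5                           | 174.4            | 323.0     |
| branch | 137.2                   | 12.6                           | 142.4            | 292.2     |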

This shows around a 10% reduction in total time (323.0 s to 292.2 s, about 9.5%), with the gain coming from Rebuild Bank (174.4 s to 142.4 s). The rebuilder threads are mainly waiting on files from the untarring: I temporarily added a 10 ms wait on each file received and saw no difference in overall time. So we've got a lot of room to put more of the indexing work on these threads.

This is with a snapshot that has the snapshot file serialized first. If the snapshot file is not serialized first, we just queue up the storage files, and process them synchronously after the snapshot unpack (same process as before).
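
The ordering rule above can be sketched as a small decision function. This is a hedged illustration rather than the PR's code: the `Entry` and `Plan` types and `plan_rebuild` are hypothetical, and "serialized first" is interpreted here as "the snapshot file appears before any storage file in the archive".

```rust
/// Hypothetical classification of archive entries in unpack order.
#[derive(Debug, Clone, PartialEq)]
enum Entry {
    SnapshotFile,
    Storage(String),
    Other,
}

/// Which storage files are rebuilt during the untar and which are deferred.
#[derive(Debug, PartialEq)]
struct Plan {
    interleaved: Vec<String>,
    deferred: Vec<String>,
}

fn plan_rebuild(entries: &[Entry]) -> Plan {
    // Interleaving is possible only if the snapshot file precedes every storage file.
    let snapshot_first = entries
        .iter()
        .position(|e| *e == Entry::SnapshotFile)
        .map_or(false, |snap| {
            entries[..snap].iter().all(|e| !matches!(e, Entry::Storage(_)))
        });

    let mut plan = Plan { interleaved: Vec::new(), deferred: Vec::new() };
    for e in entries {
        if let Entry::Storage(name) = e {
            if snapshot_first {
                plan.interleaved.push(name.clone());
            } else {
                // Fallback: queue everything and process after the unpack.
                plan.deferred.push(name.clone());
            }
        }
    }
    plan
}
```

With the snapshot file first, every storage file goes to the interleaved path; otherwise all of them are deferred, matching the previous behaviour.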

apfitzge avatar Jul 12 '22 18:07 apfitzge

Tested with multiple account directories, worked as expected.

apfitzge avatar Aug 19 '22 20:08 apfitzge

lgtm. I cannot reset my review.

jeffwashington avatar Aug 23 '22 18:08 jeffwashington