solana icon indicating copy to clipboard operation
solana copied to clipboard

abs: Update last_full_snapshot_slot before calling clean_accounts()

Open brooksprumo opened this issue 2 years ago • 3 comments

Problem

Accounts Background Service tracks the last full snapshot slot, which is used when calling clean_accounts(). When handling a full snapshot request, the last_full_snapshot_slot value is updated after calling clean_accounts(). This means that zero-lamport accounts will be kept based on the previous full snapshot slot, instead of this current slot. By default, that's 25,000 slots-worth of zero-lamport accounts that are not getting cleaned up when they could be.

Edit to add: The next time clean_accounts() gets called (after a full snapshot, in the main ABS loop) will correctly clean up those zero-lamport accounts. This change allowed cleaning before taking the snapshot, so the snapshot archive should be a bit smaller if this was able to remove accounts and shrink append vecs.

Summary of Changes

Update last_full_snapshot_slot before calling clean_accounts().

brooksprumo avatar Sep 20 '22 13:09 brooksprumo

It seems like clean_accounts is written with the assumption that a snapshot already exists at last_full_snapshot_slot, rather than is being currently created.

Yah! One thing that is not necessarily obvious: the snapshot process is not allowed to fail. If it does, asserts/panics fire and the node will die. So I was thinking that proactively cleaning more zero-lamport accounts should be safe. Since the max_clean_root will be set to the snapshot bank's slot minus 1, I think this will enable cleaning zero lamport accounts [previous full snapshot slot, this full snapshot slot), where this slot is exclusive.

I'm trying to convince myself which way is actually right; having your insight is useful to make sure all corner cases are considered. Thanks for reviewing!

brooksprumo avatar Sep 20 '22 17:09 brooksprumo

do we have any metrics that track uncleaned zero lamport accounts?

HaoranYi avatar Sep 20 '22 17:09 HaoranYi

do we have any metrics that track uncleaned zero lamport accounts?

Maybe? Not that I'm aware of. That's probably a good thing to check before merging this.

brooksprumo avatar Sep 21 '22 00:09 brooksprumo

I'm going to run this against mnb for a while to ensure it's safe.

brooksprumo avatar Sep 26 '22 15:09 brooksprumo

I'm going to run this against mnb for a while to ensure it's safe.

I ran this branch for ~4 days without issue (until the mnb restart happened). I feel OK about its current state. Will monitor for any issues going forward.

brooksprumo avatar Oct 03 '22 13:10 brooksprumo