solana clean_dead_slots_from_accounts_index unrefs correctly (backport #27461) (backport #27467)

This is an automatic backport of pull request #27467 done by Mergify. Cherry-pick of 7650bd2ad6feb0f7acca16b9a0da9a7bd9cf87f9 has failed:

On branch mergify/bp/v1.10/pr-27467
Your branch is up to date with 'origin/v1.10'.

You are currently cherry-picking commit 7650bd2ad.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   runtime/src/accounts_db.rs

no changes added to commit (use "git add" and/or "git commit -a")

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/github/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

Sep 07 '22 20:09 mergify[bot]

preparing backport to 1.10 just in case.

Sep 07 '22 20:09 jeffwashington

:scream: New commits were pushed while the automerge label was present.

Sep 07 '22 20:09 solana-grimes

> preparing backport to 1.10 just in case.

@jeffwashington @brooksprumo @apfitzge - Out of curiosity, will/should this be going into v1.10 (and I guess v1.13 now too)? I saw Brooks make a rec to someone to pull this commit in manually and it looks like the v1.11 backport landed a while back

Sep 20 '22 20:09 steviez

> will/should this be going into v1.10 (and I guess v1.13 now too)?

The current answer is "wait". There was also another long-standing under-refcount bug that was recently fixed in master. To avoid introducing instability, it is safer not to fix this type of refcounting bug when it has been present for a long time and its effects are unpleasant, but known and not fatal.

Sep 20 '22 20:09 jeffwashington

> effects are unpleasant, but known and not fatal.

This bug is causing validators to unexpectedly segfault/crash and then crash-loop due to systuner using an insufficient vm.max_map_count. Cogent Crypto experienced this again just today.

In the absence of a fix, nodes either crash every 2-4 weeks or need to be restarted regularly to avoid that. Since no EOL for 1.13 is known yet, perhaps a backport can now be reconsidered, as a month has passed since the last comment on this PR?
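
For operators who want to see how close a process is to the limit mentioned above, here is a minimal sketch. It assumes Linux, where `/proc/sys/vm/max_map_count` holds the limit and `/proc/<pid>/maps` lists one mapping per line. The helper names and the 80% warning threshold are hypothetical and not part of Solana or systuner, and the sketch inspects its own process (`/proc/self/maps`), so substitute the validator's PID to check a running node.

```rust
use std::fs;

/// Hypothetical warning threshold: warn once 80% of the limit is in use.
const WARN_RATIO: f64 = 0.8;

/// Parse the contents of /proc/sys/vm/max_map_count (a single integer).
fn parse_max_map_count(contents: &str) -> Option<u64> {
    contents.trim().parse().ok()
}

/// Count mappings in the text of /proc/<pid>/maps (one non-empty line each).
fn count_mappings(maps: &str) -> u64 {
    maps.lines().filter(|line| !line.trim().is_empty()).count() as u64
}

/// Fraction of the mapping limit currently in use.
fn usage_ratio(used: u64, limit: u64) -> f64 {
    if limit == 0 {
        return f64::INFINITY;
    }
    used as f64 / limit as f64
}

fn main() {
    // Linux-only paths; /proc/self refers to this process, not the validator.
    let limit = fs::read_to_string("/proc/sys/vm/max_map_count")
        .ok()
        .and_then(|s| parse_max_map_count(&s));
    let used = fs::read_to_string("/proc/self/maps")
        .ok()
        .map(|s| count_mappings(&s));

    match (limit, used) {
        (Some(limit), Some(used)) => {
            let ratio = usage_ratio(used, limit);
            println!("mappings in use: {} of {} ({:.1}%)", used, limit, ratio * 100.0);
            if ratio >= WARN_RATIO {
                println!("warning: close to vm.max_map_count");
            }
        }
        _ => eprintln!("could not read /proc; this sketch only works on Linux"),
    }
}
```

Tracking this ratio over days would show whether the mapping count keeps growing between restarts, which is the pattern described in this comment.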

Oct 23 '22 16:10 michaelh-laine

@jeffwashington are we comfortable with backporting this change to 1.13?

Oct 24 '22 22:10 t-nelson

This one is being tested now off 1.13: #28582. I'll hopefully add some more validators as they free up. These validators are running it:
8XWpJyTb3dFgEa1nVkTKoTY3ivg1CXN5Wr6149dXTpVP
FwQpCUn3bmB27bfhmt9Cepeyn6UPBAXCYWCwkYj5qUd9
3Ex1LtnZa5vwyZgAWC7WXy2aXcn1V1P2bpvK9Z4ZWrHj

Oct 25 '22 14:10 jeffwashington

Went into 1.13 as #28582.

Oct 28 '22 18:10 jeffwashington
