rust-lightning

Optimize ChannelMonitor persistence on block connections.

Open G8XSU opened this issue 1 year ago • 2 comments

Currently, every block connection triggers the persistence of all ChannelMonitors with an updated best_block. This approach poses challenges for large node operators managing thousands of channels. Furthermore, it leads to a thundering herd problem (https://en.wikipedia.org/wiki/Thundering_herd_problem), overwhelming the storage with simultaneous requests.

To address this issue, we now persist ChannelMonitors at a regular cadence, spreading their persistence across blocks to mitigate spikes in write operations.
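As a rough illustration of the idea, and not the code in this PR, spreading persistence across blocks can be sketched as assigning each monitor to one of `partition_factor` slots and persisting it only when the block height lands on its slot. The function name `should_persist_on_block` and the `monitor_key` parameter (any stable per-channel integer, for example one derived from the funding outpoint) are assumptions made for this sketch.

```rust
/// Hypothetical helper, not part of the rust-lightning API.
///
/// Returns true when the monitor identified by `monitor_key` should be
/// persisted for the block at `height`. Each monitor falls into one of
/// `partition_factor` slots, so in any window of `partition_factor`
/// consecutive blocks every monitor is written exactly once.
fn should_persist_on_block(monitor_key: u64, height: u32, partition_factor: u32) -> bool {
    // A factor of 0 or 1 disables partitioning: persist on every block.
    if partition_factor <= 1 {
        return true;
    }
    let pf = u64::from(partition_factor);
    monitor_key % pf == u64::from(height) % pf
}
```

With evenly distributed keys, each block then triggers roughly `1 / partition_factor` of the writes that persisting every monitor would, at the cost of a persisted `best_block` that can lag by up to `partition_factor - 1` blocks.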

Tasks:

  • [ ] Wait for "Don't pause events for chainsync persistence" (#2957) and base this PR on it.
  • [ ] Concept/Approach Ack
  • [ ] Decide a good default for partition_factor
  • [ ] Maybe we can make partition_factor user-configurable. (Can also do this later, depends on our default value)
  • [ ] Write more tests for persistence with partition_factor.

Closes #2647

G8XSU avatar Mar 25 '24 13:03 G8XSU

Is this something we might want to consider not doing on mobile? Thinking that we won't be able to RBF onchain claims properly if the fee estimator is broken and we're not persisting the most recent feerate we tried within the OnchainTxHandler.

wpaulino avatar Apr 08 '24 16:04 wpaulino

I guess we should/could consider always persisting if there are pending claims (e.g. the channel has been closed but still has balances to claim)? Alternatively, we could always persist if we have fewer than 5 channels.

TheBlueMatt avatar Apr 11 '24 13:04 TheBlueMatt

What's the status here @G8XSU?

TheBlueMatt avatar Jun 03 '24 14:06 TheBlueMatt

Yes makes sense, we can always persist if there are pending claims.

I am looking for a concept/approach ack here before I proceed with the rest of the changes. Are we heading in the right direction on how to distribute persistence across blocks?

G8XSU avatar Jun 04 '24 20:06 G8XSU

I think wpaulino raised a good point and we should do something to ensure we regularly persist monitors on mobile (like what I suggested above), but otherwise concept ACK.
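One possible reading of the exceptions discussed in this thread, written as a minimal sketch rather than the PR's actual logic: always persist a monitor that has pending claims, always persist when the node has fewer than 5 channels, and otherwise fall back to a per-block partition. `MonitorInfo`, `should_persist_monitor`, and `SMALL_NODE_CHANNEL_THRESHOLD` are hypothetical names introduced for this example.

```rust
/// Hypothetical summary of the per-monitor state the policy needs.
struct MonitorInfo {
    /// Any stable per-channel integer (assumed, e.g. derived from the funding outpoint).
    key: u64,
    /// True if the channel is closed but still has on-chain balances to claim.
    has_pending_claims: bool,
}

/// Taken from the "fewer than 5 channels" suggestion in the thread.
const SMALL_NODE_CHANNEL_THRESHOLD: usize = 5;

/// Hypothetical policy: decide whether to persist `info` for the block at `height`.
fn should_persist_monitor(
    info: &MonitorInfo,
    channel_count: usize,
    height: u32,
    partition_factor: u32,
) -> bool {
    // Pending claims: keep the latest claim state on disk on every block.
    if info.has_pending_claims {
        return true;
    }
    // Small (e.g. mobile) nodes: writes are cheap, so never skip.
    if channel_count < SMALL_NODE_CHANNEL_THRESHOLD {
        return true;
    }
    // A factor of 0 or 1 disables partitioning.
    if partition_factor <= 1 {
        return true;
    }
    // Otherwise persist only on the block that matches this monitor's slot.
    let pf = u64::from(partition_factor);
    info.key % pf == u64::from(height) % pf
}
```

Checking the exemptions before the partition test means a mobile node, or a monitor with pending claims, behaves exactly as it does without this optimization.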

TheBlueMatt avatar Jun 06 '24 19:06 TheBlueMatt

Marking this PR ready for review.

G8XSU avatar Jun 17 '24 20:06 G8XSU

Squashed the fixup commits.

G8XSU avatar Jun 19 '24 07:06 G8XSU