substrate
Add API for block pinning
Block pinning is a feature required by the RPC Spec V2. Pinning should be performed for subscribers of the chainHead_unstable_follow method.
Each block produced by the chainHead_unstable_follow subscription (new block, best block, finalized block) should be pinned to offer data access for:
- block body (the old chain_getBlock RPC)
- block header (the old chain_getHeader RPC)
- storage
- runtime call
Substrate already has pin and unpin functionality, but it only preserves the state of the block. Extend the database and client to support pin_block and unpin_block, which reference-count the desired block and prevent it from being pruned.
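The reference-counting idea above can be sketched as follows. This is an illustrative standalone structure, not the actual client/db API; the `BlockHash` alias and all method names are assumptions for the sketch.

```rust
use std::collections::HashMap;

/// Stand-in for the client's block hash type (assumption for this sketch).
type BlockHash = u64;

/// Minimal sketch of reference-counted block pinning.
#[derive(Default)]
struct PinnedBlocks {
    refs: HashMap<BlockHash, u32>,
}

impl PinnedBlocks {
    /// Increase the pin count; the block must not be pruned while it is > 0.
    fn pin_block(&mut self, hash: BlockHash) {
        *self.refs.entry(hash).or_insert(0) += 1;
    }

    /// Decrease the pin count; at zero the block becomes prunable again.
    fn unpin_block(&mut self, hash: BlockHash) {
        if let Some(count) = self.refs.get_mut(&hash) {
            *count -= 1;
            if *count == 0 {
                self.refs.remove(&hash);
            }
        }
    }

    /// The pruning logic would consult this before discarding a block.
    fn is_pinned(&self, hash: &BlockHash) -> bool {
        self.refs.contains_key(hash)
    }
}
```

With reference counting, multiple subscribers can pin the same block independently and pruning is only unblocked once every pin has been released.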
// CC @bkchr @skunert @jsdw @niklasad1
Is there anything special that we should do for storage and runtime calls at a pinned block, other than what was done in #12476?
Just to summarize from what @bkchr suggested in #12476:
While this is right, this is also right for everything else in the node. IMO it would probably be nice to have all blocks pinned that are still in some "notification" in the node. When the last notification for a block is dropped, the pinning could be stopped.
API pinning / unpinning
The Substrate client needs to expose an API for pinning / unpinning blocks.
A block that is pinned with this API has the following guarantees:
- client/db code does not prune the block's body / header / justification
- client/state-db code does not prune the block's storage
The Substrate client pins blocks before they are reported to any user of import_notification_stream and/or finality_notification_stream.
When the last BlockImportNotification or FinalityNotification referencing a block is dropped, the block is unpinned.
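The "unpin when the last notification is dropped" behaviour can be expressed with a guard held behind an Arc inside the notification, so that the Drop of the last clone releases the pin. This is a hedged sketch; the pin table, guard, and `notify` helper are hypothetical names, not Substrate's actual types.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type BlockHash = u64;

/// Shared pin table (illustrative; the real state would live in client/db).
type PinTable = Arc<Mutex<HashMap<BlockHash, u32>>>;

/// Guard embedded in a notification: releases one pin when dropped.
struct UnpinGuard {
    table: PinTable,
    hash: BlockHash,
}

impl Drop for UnpinGuard {
    fn drop(&mut self) {
        let mut table = self.table.lock().unwrap();
        if let Some(count) = table.get_mut(&self.hash) {
            *count -= 1;
            if *count == 0 {
                table.remove(&self.hash);
            }
        }
    }
}

/// Sketch of a notification carrying the guard behind an Arc,
/// so that cloned notifications share a single pin.
#[allow(dead_code)]
#[derive(Clone)]
struct BlockImportNotification {
    hash: BlockHash,
    _pin: Arc<UnpinGuard>,
}

/// Hypothetical helper: pins the block, then builds the notification.
fn notify(table: &PinTable, hash: BlockHash) -> BlockImportNotification {
    *table.lock().unwrap().entry(hash).or_insert(0) += 1;
    BlockImportNotification {
        hash,
        _pin: Arc::new(UnpinGuard { table: table.clone(), hash }),
    }
}
```

Because the guard sits behind an Arc, cloning the notification does not add pins; the block is unpinned exactly once, when the last clone goes away.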
Implications
Leaking Blocks in DB
If the substrate node crashes or is terminated, the block's entries from client/db
side will remain in the database forever. To mitigate this:
- Option 1. Mark pinned blocks in the DB. If a block is pinned when prune_block is executed, we mark the block as pinned in the DB. This implies extending the meta_keys with a new key, meta_key::PINNED_BLOCKS, that works similarly to meta_key::CHILDREN_PREFIX. On node start-up, PINNED_BLOCKS is loaded from the DB and the given blocks are pruned.
- Option 2. Prune blocks and keep the info in memory. This option is simpler in nature but has the downside that the node needs to keep the information in memory. We prune right away and serve the client from an in-memory cache.
I believe Option 1 is a promising approach for the long term, as block bodies could grow substantially. However, given the time constraints, I believe we could implement Option 2 as a medium-term solution (possibly long-term if block bodies don't grow much).
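Option 2 can be sketched roughly like this: when pruning reaches a pinned block, its data is moved into an in-memory cache instead of being discarded, and read paths consult the cache first. The structure and method names here are illustrative assumptions, not an existing Substrate API.

```rust
use std::collections::HashMap;

type BlockHash = u64;
type BlockBody = Vec<u8>;

/// Sketch of Option 2: pinned blocks whose data has already been
/// pruned from the DB are served from memory.
#[derive(Default)]
struct PinnedCache {
    cache: HashMap<BlockHash, BlockBody>,
}

impl PinnedCache {
    /// Called by the pruning logic instead of discarding a pinned block.
    fn retain_pruned(&mut self, hash: BlockHash, body: BlockBody) {
        self.cache.insert(hash, body);
    }

    /// Read path: the client consults the cache before the DB.
    fn body(&self, hash: &BlockHash) -> Option<&BlockBody> {
        self.cache.get(hash)
    }

    /// On the final unpin, the cached data is released.
    fn release(&mut self, hash: &BlockHash) {
        self.cache.remove(hash);
    }
}
```

The downside noted above is visible here: the cache lives only in process memory, so it is lost on a crash, which is exactly why Option 1's persisted PINNED_BLOCKS marker would matter long-term.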
Enforced Substrate Limits
This DB feature does not enforce any limit on the number of blocks kept around. Instead, it relies on the RPC layer to behave correctly and be subject to a sensible limit.
@arkpar raised this in #12497:
I'd argue that if you let the clients pin the blocks as they please with no control of used memory, they are more likely to screw the node up and "break things".
- Option 1. Limit the total number of pinned blocks. With this approach, we ensure that no more than 1024 blocks are ever pinned while reporting the block's notification. When the limit is reached, the BlockImportNotification and FinalityNotification streams are dropped.
- Option 2. Rely on the RPC layer to ensure proper limits. Given the RPC layer provides an unpin method, this implies that the end-user has control over the node's memory. If there is bad behavior from the RPC layer, this might raise security concerns.
To err on the safe side, Option 1 seems like a good path forward. We enforce a limit big enough on the DB side, while also enforcing a restrictive limit (a few minutes' worth of blocks) for the RPC layer.
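The limiting behaviour of Option 1 can be sketched as a cap checked at pin time; a failed pin would signal the caller to drop the notification streams. The type, the method name, and the failure signalling are assumptions for illustration (the 1024 figure comes from the discussion above).

```rust
use std::collections::HashSet;

/// Sketch of Option 1's cap on pinned blocks.
struct LimitedPins {
    pinned: HashSet<u64>,
    max_pinned: usize,
}

impl LimitedPins {
    fn new(max_pinned: usize) -> Self {
        Self { pinned: HashSet::new(), max_pinned }
    }

    /// Returns false when pinning would exceed the limit; the caller
    /// would then drop the notification streams.
    fn try_pin(&mut self, hash: u64) -> bool {
        if self.pinned.contains(&hash) {
            return true; // already pinned, no extra memory used
        }
        if self.pinned.len() >= self.max_pinned {
            return false;
        }
        self.pinned.insert(hash);
        true
    }
}
```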
Alternatives
Build upon the https://github.com/paritytech/substrate/pull/12497 "delayed-pruning" PR and expose a new notification stream, BlockPrunnedNotification, similar to BlockImportNotification and FinalityNotification.
The RPC layer can subscribe to BlockPrunnedNotification and, if the RPC layer has the reported block pinned, the subscription is dropped immediately.
This has the benefit of avoiding the previous limitations, plus not needing to hold references to the notifications, but the 32 delayed blocks might need increasing if blocks are produced at an interval longer than 6 seconds.
// CC @tomaka @bkchr @arkpar @skunert @jsdw @niklasad1 Would love to hear your thoughts on this 🙏
Another aspect is that we need to report the node's blocks from memory after the first initialized event. The initialized event contains client.info().finalized_hash.
From RPC V2 Spec:
Afterwards, generates one newBlock notification (see below) for each non-finalized block currently in the node's memory (including all forks), then a bestBlockChanged notification. The notifications must be sent in an ordered way such that the parent of each block either can be found in an earlier notification or is the current finalized block.
This is achieved at the moment without subscribing to BlockImportNotification or FinalityNotification. Instead, we are traversing all the children of the finalized block.
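The traversal described above can be sketched as a walk over a children index starting at the finalized hash, emitting each block after its parent so the spec's ordering requirement holds. The `children` map stands in for the client's blockchain backend; everything here is illustrative.

```rust
use std::collections::HashMap;

type BlockHash = u64;

/// Sketch: collect the non-finalized blocks to report as `newBlock`
/// events, starting from the finalized block and visiting children so
/// that every parent appears before its children.
fn initial_new_blocks(
    finalized: BlockHash,
    children: &HashMap<BlockHash, Vec<BlockHash>>,
) -> Vec<BlockHash> {
    let mut out = Vec::new();
    let mut stack = vec![finalized];
    while let Some(hash) = stack.pop() {
        for &child in children.get(&hash).into_iter().flatten() {
            // here the real code would emit newBlock { hash: child, parent: hash }
            out.push(child);
            stack.push(child);
        }
    }
    out
}
```

A depth-first walk is enough: any order works as long as a block is pushed only after its parent, which the spec quoted above requires.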
Just relinking here my comment about smoldot's API: https://github.com/paritytech/substrate/pull/12476#issuecomment-1307828819
First, thank you for creating this good write-up @lexnv!
- Option 2. Prune blocks and keep the info in memory
This option is more simplistic in nature but has the downside that the node needs to keep the information in memory. We prune right away and serve the client from an in-memory cache.
Sounds like a good idea to me. If we need db support for this, we could still implement this later.
- Option 2. Rely on the RPC layer to ensure proper limits. Given the RPC layer provides an unpin method, this implies that the end-user has control over the node's memory. If there is bad behavior from the RPC layer, this might raise security concerns.
For sure we need a limit! I think we should go this way. In the node we should "trust" all logic to not pin blocks indefinitely and that everything has some reasonable upper bound. For now I would not try to put this upper bound into the pinning and more rely on the fact that every component on its own ensures that it doesn't keep all blocks pinned.
For the RPC layer/RPC server sitting in Substrate we should ensure that we have some kind of upper bound of pinned blocks per node and in total. These upper bounds should also be part of the spec or somehow be queryable by applications, so they can follow these upper bounds.
There are two kinds of blocks/states that may be pinned.
- Finalized blocks. If not pinned, these blocks are discarded when they fall out of the pruning history window. Pinning them simply means delaying pruning, i.e. temporarily increasing the pruning window. In fact, the hard limit on the pinned blocks and the size of the pruning history could be the same. The default pruning window is big enough anyway, so in most cases there won't be any overhead.
- Blocks that are not yet finalized. If not pinned, these blocks either become finalized and move to the first category, or are discarded when a sibling branch is finalized. Pinning these is a bit complicated. E.g. when you pin a block you also effectively pin all its parents, and the whole branch can't be discarded. The simplest way to implement this is to delay discarding blocks in a way similar to #12497. The hard limit would be the maximum number of blocks we can delay for.
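The "pinning a block effectively pins all its parents" point can be made concrete with a small sketch: given a set of explicitly pinned blocks and a parent map (both illustrative stand-ins for the client's tree), the set of blocks that must survive pruning is the union of each pinned block's ancestry.

```rust
use std::collections::{HashMap, HashSet};

type BlockHash = u64;

/// Sketch: compute every block kept alive by pinning, by walking each
/// pinned block's ancestry up to the root (a block absent from `parent`
/// is treated as the finalized root).
fn effectively_pinned(
    pinned: &HashSet<BlockHash>,
    parent: &HashMap<BlockHash, BlockHash>,
) -> HashSet<BlockHash> {
    let mut kept = HashSet::new();
    for &block in pinned {
        let mut current = block;
        // insert() returns false once an ancestor was already visited,
        // so shared ancestry is walked only once
        while kept.insert(current) {
            match parent.get(&current) {
                Some(&p) => current = p,
                None => break,
            }
        }
    }
    kept
}
```

This is why a sibling branch cannot be discarded while any of its descendants is pinned, and why delaying discards (as in #12497) is the simple way to bound the cost.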