Fix `runtime.history.History.Watchblocks` to emit sequential blocks (no gaps)
Subscription to runtime.client.api.WatchBlocks (see) may return block sequence with missing rounds.
This may happen because runtime.history.History is not notifying blocks if they are Committ-ed via notify=false (during reindex)**.
Note this is only problematic if node !hasLocalStorage, as otherwise storage syncing (worker.storage.committee.Node) is responsible for notifying blocks via StorageSyncCheckpoint. This worker subscribes to roothash.ServiceClient.WatchBlocks, which may have gaps, however it fetches missing rounds.
As discussed in private, roothash.ServiceClient should probably not be responsible for reindexing, nor should roothash.BlockHistory know anything about storage, node types etc... Likely this code needs a thorough refactor.
** We are guaranteed to have at least two reindexes, possibly more if there are errors during subsequent event processing.
As discussed in private, roothash.ServiceClient should probably not be responsible for reindexing, nor should roothash.BlockHistory know anything about storage, node types etc... Likely this code needs a thorough refactor.
Idealy, we move reindexing part from roothash.ServiceClient to something like worker.runtime.xxx.
The worker should be initialized, just like others, after we have initialized RuntimeRegistry. Worker should first fetch runtime.history.History.LastConsensusHeight and reindex until n blocks away from consensus sync. At this point, it should subscribe to roothash.WatchBlocks, do another reindex to fill possible gap and then proceed with regular processing. After that, blocks should be always notified without any gaps.
What about problematic runtime.history.History?
-
runtime.history.Historydoes two things:- it implements
roothash.BlockHistory, with pruning policy - it exposes methods that take storage sync into account.
- it implements
- We could fix this by either:
- removing
runtime.history.Historyinterface altogether and reference concrete implementation directly, - split existing block history abstraction into two,
- moving methods that take storage sync into account to the new worker.
- removing
- The Problem is that we implement
api.RuntimeClient.WatchBlocks, by subscribing toHistory.WatchBlocks.- Whichever solution we will have, we are still hiding away from clients of
api.RuntimeClient.WatchBlocks(7rn), that behavior depends on the underlying node storage type. - I prefer explicit parameters && documentation, over hidden flags/state, but then this would mean exposing node/storage type as part of our runtime client API :/?
- Whichever solution we will have, we are still hiding away from clients of
Alternatively (not ideal), we may push for smtg similar to #6079, or keep it even simpler and use Commit/(Batch) with notify=true, during subsequent reindexes.
I think it would be nice to refactor properly, but also depends on our priorities.
Idealy, we move reindexing part from
roothash.ServiceClientto something likeworker.runtime.xxx.
Agree. I tried to do that in #6089.