
Sync 2.0

Open arkpar opened this issue 3 years ago • 4 comments

This is a tracking issue for the new implementation of the syncing protocol and algorithm in Substrate. It mostly concerns full/fast sync, not warp sync.

Current implementation issues.

  1. Reliance on the longest chain fork rule.
  2. No clear separation between initial and keep-up sync.
  3. Does not consider finality in full sync mode.
  4. Complicated common ancestry search.

Proposed new protocol.

On connection, peers exchange their best finalized block and all leaf blocks that are not yet finalized or discarded. On import of each block, a node sends an announcement to all connected peers. The announcement includes the newly imported block's information along with the best finalized block for a peer.
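The two messages described above could be sketched roughly as follows. This is only an illustration: `BlockInfo`, `Handshake`, `Announcement`, `PeerId` and `announce_on_import` are hypothetical names, not existing Substrate types, and the sketch assumes the announcement carries the sender's own best finalized block.

```rust
// Hypothetical message shapes for illustration only; these are not
// Substrate's actual network types.
pub type Hash = [u8; 32];
pub type Number = u64;
pub type PeerId = u32;

/// Minimal description of a block as exchanged between peers.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BlockInfo {
    pub hash: Hash,
    pub number: Number,
    pub parent_hash: Hash,
}

/// Sent once by each side when a connection is established.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Handshake {
    pub best_finalized: BlockInfo,
    /// Leaves that are neither finalized nor discarded.
    pub leaves: Vec<BlockInfo>,
}

/// Sent to every connected peer after a block has been imported.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Announcement {
    pub imported: BlockInfo,
    /// Assumption: this is the sender's best finalized block.
    pub best_finalized: BlockInfo,
}

/// Build one announcement per connected peer for a newly imported block.
pub fn announce_on_import(
    imported: &BlockInfo,
    best_finalized: &BlockInfo,
    peers: &[PeerId],
) -> Vec<(PeerId, Announcement)> {
    peers
        .iter()
        .map(|&peer| {
            let msg = Announcement {
                imported: imported.clone(),
                best_finalized: best_finalized.clone(),
            };
            (peer, msg)
        })
        .collect()
}
```

A real implementation would also need a wire encoding and would probably avoid re-sending a block to the peer it was received from; both are left out here.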

Proposed syncing algorithm.

For each connected peer we maintain the set of leaves that they have and the common finalized number. These are updated on each announcement.
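A minimal sketch of that per-peer bookkeeping, under some assumptions: hashes are shortened to `u64` for readability, the common finalized number is taken as the minimum of our finalized number and the peer's, and `PeerTable`, `PeerState` and `Announced` are hypothetical names.

```rust
use std::collections::HashMap;

// Stand-in types for the sketch; a real node would use its own hash type.
pub type Hash = u64;
pub type Number = u64;
pub type PeerId = u32;

/// The block carried by an announcement (hypothetical shape).
#[derive(Debug, Clone, Copy)]
pub struct Announced {
    pub hash: Hash,
    pub parent: Hash,
    pub number: Number,
}

/// What we track for one connected peer.
#[derive(Debug, Default)]
pub struct PeerState {
    /// Leaf hash -> block number, for leaves the peer has not finalized.
    pub leaves: HashMap<Hash, Number>,
    /// Assumed common point on the finalized chain.
    pub common_finalized: Number,
}

#[derive(Debug)]
pub struct PeerTable {
    our_finalized: Number,
    peers: HashMap<PeerId, PeerState>,
}

impl PeerTable {
    pub fn new(our_finalized: Number) -> Self {
        PeerTable { our_finalized, peers: HashMap::new() }
    }

    pub fn peer(&self, id: PeerId) -> Option<&PeerState> {
        self.peers.get(&id)
    }

    /// Update a peer's leaves and common finalized number from an announcement.
    pub fn on_announcement(&mut self, peer: PeerId, block: Announced, peer_finalized: Number) {
        let our_finalized = self.our_finalized;
        let state = self.peers.entry(peer).or_default();
        // The new block replaces its parent as a leaf, if the parent was one.
        state.leaves.remove(&block.parent);
        state.leaves.insert(block.hash, block.number);
        // Leaves at or below the peer's finalized number are finalized or discarded.
        state.leaves.retain(|_, number| *number > peer_finalized);
        state.common_finalized = peer_finalized.min(our_finalized);
    }
}
```

Tracking leaves by hash and number keeps the update cheap; whether a real implementation should also verify that the announced finalized block lies on our finalized chain is left open here.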

There are two major syncing modes:

Initial sync. In this mode the node syncs a single finalized chain. It is triggered on startup or when a majority of peers start announcing blocks with unknown parents (i.e. after an offline period). Since this is a canonical chain, a series of blocks may be requested by number, allowing for parallel block download from multiple peers. There's no need to do an ancestry search, as we can simply assume that the common block is the one with the minimum number in a finalized chain.
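As a rough sketch of how requesting by number enables parallel download: the finalized range above our own finalized block can be cut into fixed-size chunks and spread across peers. The function name `plan_initial_sync`, the chunk size parameter and the round-robin assignment are assumptions made for this example, not part of the proposal.

```rust
use std::ops::RangeInclusive;

pub type Number = u64;
pub type PeerId = u32;

/// Split the finalized range we are missing into chunks and assign each chunk
/// to a peer that has already finalized all of it, in round-robin order.
///
/// `peers` holds (peer id, that peer's best finalized number).
pub fn plan_initial_sync(
    our_finalized: Number,
    peers: &[(PeerId, Number)],
    chunk_size: Number,
) -> Vec<(PeerId, RangeInclusive<Number>)> {
    let mut plan = Vec::new();
    if chunk_size == 0 {
        return plan;
    }
    // Sync up to the highest finalized number any peer reports.
    let target = match peers.iter().map(|&(_, n)| n).max() {
        Some(n) => n,
        None => return plan,
    };
    // No ancestry search: everything up to our finalized block is assumed common.
    let mut start = our_finalized + 1;
    let mut next_peer = 0usize;
    while start <= target {
        let end = (start + chunk_size - 1).min(target);
        // Next peer in round-robin order whose finalized chain covers the chunk.
        let chosen = (0..peers.len())
            .map(|i| (next_peer + i) % peers.len())
            .find(|&i| peers[i].1 >= end);
        match chosen {
            Some(i) => {
                plan.push((peers[i].0, start..=end));
                next_peer = i + 1;
            }
            None => break,
        }
        start = end + 1;
    }
    plan
}
```

For example, with our finalized block at 100, peers finalized at 350 and 300, and a chunk size of 100, the plan alternates between the two peers and gives the final partial chunk to the peer that has finalized up to 350.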

Keep-up sync. The node switches to this mode when it finalizes a recent block. In this mode sync simply requests the chains that peers have, starting from each leaf and going back to the best finalized block (or an existing parent block).
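The backwards walk in keep-up sync might look like the sketch below. It assumes parent links are already available in a map (`parent_of`), whereas a real node would learn them from block responses; `missing_chain` and the `u64` hashes are hypothetical.

```rust
use std::collections::{HashMap, HashSet};

// Stand-in hash type for the sketch.
pub type Hash = u64;

/// Walk from `leaf` back toward the best finalized block and return the blocks
/// we still need, ordered parent first so they can be imported in sequence.
///
/// The walk stops at the finalized block, at any block we already have,
/// or when a parent link is not known.
pub fn missing_chain(
    leaf: Hash,
    parent_of: &HashMap<Hash, Hash>,
    known: &HashSet<Hash>,
    best_finalized: Hash,
) -> Vec<Hash> {
    let mut missing = Vec::new();
    let mut current = leaf;
    while current != best_finalized && !known.contains(&current) {
        missing.push(current);
        match parent_of.get(&current) {
            Some(&parent) => current = parent,
            None => break,
        }
    }
    missing.reverse();
    missing
}
```

Running this once per leaf of each peer yields the set of chains to request; forks that share a known ancestor stop at that ancestor, so no separate common ancestry search is needed.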

arkpar avatar Jan 26 '22 23:01 arkpar

cc @bkchr

arkpar avatar Jan 26 '22 23:01 arkpar

Is this supposed to work with chains that don't have finality? I'm worried about the growing number of places where finality is implicitly assumed to exist in Substrate, making it less generic.

Also, I'm interested in a clearer separation between initial and keep-up sync here.

nazar-pc avatar Aug 09 '22 07:08 nazar-pc

It may not be finality in the same sense as the actual finality engine used. Substrate indeed already has a limit on the maximum possible re-org. So in the simplest case sync may consider "final" any block past that threshold. But yes, this strategy uses the notion of finality to make things simpler. We may make the sync strategy pluggable to support other cases though.
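A minimal sketch of that simplest case, assuming the re-org limit is expressed as a depth in blocks below the best block. The names `assumed_final_number`, `is_assumed_final` and `max_reorg_depth` are made up for illustration and do not refer to an existing Substrate API.

```rust
/// Highest block number treated as "final" for sync purposes, given the best
/// block number and a maximum re-org depth (hypothetical parameter).
pub fn assumed_final_number(best_number: u64, max_reorg_depth: u64) -> u64 {
    // Near genesis the subtraction saturates, so only block 0 counts as final.
    best_number.saturating_sub(max_reorg_depth)
}

/// Whether a block at `block_number` is deep enough to be treated as final.
pub fn is_assumed_final(block_number: u64, best_number: u64, max_reorg_depth: u64) -> bool {
    block_number <= assumed_final_number(best_number, max_reorg_depth)
}
```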

arkpar avatar Aug 09 '22 19:08 arkpar

We must support chains without their own finality, since this describes parachains. We do not afaik need chains without any notion of finality, because afaik they're not really secure anyways.

A priori, I think this proposal makes sync easier to understand, which maybe helps us do the async time thing by Handan (and Peter Czaban) and my slashing reform idea. I've not really thought about the details though. At least slashing reform cares most about sync ahead of finality, which is not really the topic here I guess, so maybe it's just orthogonal; not sure about async time.

burdges avatar Aug 10 '22 21:08 burdges

We do not afaik need chains without any notion of finality, because afaik they're not really secure anyways.

Are you suggesting that Substrate should not support PoW and similar consensus mechanisms that don't have deterministic finality at all?

nazar-pc avatar Aug 12 '22 12:08 nazar-pc

Yes, exactly.

Proof-of-work is not secure. It's especially bad on smaller chains à la https://www.crypto51.app/ but Bitcoin will eventually be double spent too, à la https://economics.princeton.edu/working-papers/on-the-instability-of-bitcoin-without-the-block-reward/

We'll never fully support proof-of-work parachains anyways. We need pre-collation proofs in too many places, like parachain block submission, XCMP message transport, etc, so attackers could halt proof-of-work parachains fairly easily. And proof-of-work parachains would de facto pay 10x more for their parachain slot, assuming they miss like 9/10 relay chain slots.

Intuitively, if you have a properly engineered protocol then you have somewhat tight tolerances everywhere, so proof-of-work anywhere typically enables DoS attacks on your larger protocol.

burdges avatar Aug 13 '22 01:08 burdges