lasp

Syncing initial state or state after a crash where state is lost

Open cmeiklejohn opened this issue 6 years ago • 0 comments

Migrating from lasp-lang/lasp_pg#13.

With the state-based propagation backend, anti-entropy isn't triggered immediately when a new node joins the cluster. As a result, it may take some time before the new node sees updates from the other nodes in the cluster.

Reproducer:

  • Server 1 starts up
  • Server 1 adds Process 1 to a lasp_pg group
  • Server 2 starts up
  • Server 2 joins as peer
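
The reproducer above can be sketched with a toy model. This is not Lasp or lasp_pg code: the `Node` class, its `join`, `add`, and `anti_entropy_round` methods, and the use of a plain set as the group state are all assumptions made for illustration. The sketch only shows that if joining exchanges no state, the new node sees nothing until an anti-entropy round runs.

```python
class Node:
    """Toy replica holding a grow-only set (a stand-in for a lasp_pg group)."""

    def __init__(self, name):
        self.name = name
        self.members = set()  # state-based: merging two states is set union
        self.peers = []

    def join(self, other):
        # Assumption: joining only records membership; no state is exchanged.
        self.peers.append(other)
        other.peers.append(self)

    def add(self, pid):
        self.members.add(pid)

    def anti_entropy_round(self):
        # Periodic push of the full state to every known peer.
        for peer in self.peers:
            peer.members |= self.members


server1 = Node("server1")
server1.add("process1")          # Server 1 adds Process 1 to the group

server2 = Node("server2")
server2.join(server1)            # Server 2 joins as peer
seen_after_join = set(server2.members)

server1.anti_entropy_round()     # only now does Server 2 learn about Process 1
seen_after_round = set(server2.members)

print(seen_after_join, seen_after_round)
```

In this model Server 2's view stays empty between the join and the first anti-entropy round, which is the window the issue describes.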

The issue becomes more problematic with the delta-based propagation backend when a node is new, or has failed and is recovering. Consider the following example:

  • Server 1 starts up
  • Server 2 joins
  • Server 1 updates
  • Server 1 buffers the deltas and sends them to Server 2
  • Server 2 acknowledges the deltas
  • Server 2 shuts down with a crash failure (or rejoins with a new identifier)
  • Server 2 will receive no changes until the next change to that same data item: nothing has been buffered for that node, and even if something had been, a node that recovers with no disk comes back with an empty buffer
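
The delta scenario above can be modelled in a similar way. Again, this is a hypothetical sketch rather than Lasp's implementation: `DeltaNode`, its per-peer `buffer`, and the assumption that acknowledged deltas are dropped from the buffer are illustrative choices, and a crash with no disk is modelled by replacing the node with a fresh, empty instance.

```python
class DeltaNode:
    """Toy replica that ships only buffered, unacknowledged deltas."""

    def __init__(self, name):
        self.name = name
        self.state = set()
        self.buffer = {}  # peer name -> list of deltas not yet acknowledged

    def join(self, peer):
        # Start tracking each other; no existing state is transferred.
        self.buffer.setdefault(peer.name, [])
        peer.buffer.setdefault(self.name, [])

    def update(self, element):
        delta = {element}
        self.state |= delta
        for pending in self.buffer.values():
            pending.append(delta)

    def send_deltas(self, peer):
        pending = self.buffer.get(peer.name, [])
        for delta in pending:
            peer.state |= delta
        # Assumption: the peer acknowledges, so the sender drops the deltas.
        self.buffer[peer.name] = []


server1 = DeltaNode("server1")
server2 = DeltaNode("server2")
server2.join(server1)

server1.update("x")
server1.send_deltas(server2)      # Server 2 receives and acknowledges
state_before_crash = set(server2.state)

# Crash with no disk: Server 2 comes back with empty state and empty buffer.
server2 = DeltaNode("server2")
server2.join(server1)
server1.send_deltas(server2)      # nothing is buffered, so nothing is sent
state_after_recovery = set(server2.state)

print(state_before_crash, state_after_recovery)
```

After recovery, Server 1 has nothing left to send because the earlier delta was acknowledged and discarded, so in this model Server 2 stays empty until Server 1 makes another update.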

cc: @russelldb @tsloughter @vitorenesduarte

cmeiklejohn, Apr 11 '18 09:04