LIP-1 Ability to resync channel state

Open udirom opened this issue 5 years ago • 4 comments

What happens if one of the parties loses synchronization?

Scenario 1 - Server out of sync? The protocol server sends a command asking the protocol client to mutate an object. The command _reads object version 1 in an attempt to write version 2, while the protocol client already possesses version 5. What should the client do in that case? Fall back to the server's version?

Scenario 2 - Client out of sync? The protocol client sends a command that reads object version 1 and intends to mutate it to version 2, but the server's latest version is already version 5. Since the server is the source of truth, how is the client supposed to fetch the diff (versions 2-5) and resync its channel state?

Scenario 3 - One of the parties lost its channel state completely (disaster recovery)

  1. When it is back up, how can it know that it even had a channel with another VASP? Maybe a channel health check mechanism is required here.
  2. If it knows about the channel, how can it ask the other side to help reconstruct its state?

udirom avatar Oct 08 '20 12:10 udirom

Versioning is per object; there is no global version. For TR specifically, that means losing sync is of little importance (or at least no more important than someone losing their entire DB, which is already fairly painful but doesn't prevent communications from continuing). If this were global state, you would be correct that it would break the channel. Let's take an example and explore what happens:

  • Two VASPs, C and D
  • C creates a TR object with state abcd and sends it to D, who responds with OK. This request has _reads empty because it reads nothing, and _writes of abcd because it writes the state of this one object as abcd
  • D (or C) loses their entire DB
  • C creates another TR object for a new transaction. It has _reads empty, since this is a new object and hence does not depend on any prior state of this object. It has _writes of cbad for this new object. D is able to handle this fine, because all it cares about is that it has the object state referenced by _reads. Since _reads is empty, this command has no prior dependencies.
  • C tries to update the first object abcd. It sends a command with _reads of abcd and _writes of 1234. This will fail, since D no longer has this state due to the dropped database. C will need to re-create this object if they wish to submit it on-chain (or D will likely refund it if it's submitted on-chain). So there is a set of transactions that will have issues, but it's fairly short-lived: essentially only the objects that were open at the time the database was dropped (assuming there was no backup from which to restore).

As you can see, there isn't actually a requirement to know that you previously had a channel with a VASP if you lose your database, since the only thing you lose is open transactions, which can be re-instantiated when the sending party sees an error response. We considered state synchronization, but since these are individual object states, the scope of the failure is pretty limited, and giving up prevention of all failures in exchange for simplicity felt like a reasonable tradeoff.
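
The example above can be sketched roughly as follows. This is a simplified illustration, not the reference implementation: `Command`, `ObjectStore`, and `apply` are hypothetical names, and the sketch only checks that every version listed in _reads is known locally (it does not model signatures, retries, or retiring old versions).

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    reads: list   # versions this command depends on (the _reads field)
    writes: list  # versions this command creates (the _writes field)


@dataclass
class ObjectStore:
    """Hypothetical per-VASP store of the object versions it knows about."""
    versions: set = field(default_factory=set)

    def apply(self, cmd: Command) -> str:
        # Accept a command only if every version it reads is known locally.
        missing = [v for v in cmd.reads if v not in self.versions]
        if missing:
            return "error: unknown versions " + ", ".join(missing)
        self.versions.update(cmd.writes)
        return "OK"


d = ObjectStore()
r1 = d.apply(Command(reads=[], writes=["abcd"]))        # first TR object
d.versions.clear()                                      # D loses its entire DB
r2 = d.apply(Command(reads=[], writes=["cbad"]))        # new object, no dependencies
r3 = d.apply(Command(reads=["abcd"], writes=["1234"]))  # update to the lost object
print(r1, r2, r3, sep="\n")
```

Under these assumptions, only the third command fails, which matches the point that the damage is limited to objects that were open when the database was dropped.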

kphfb avatar Oct 09 '20 17:10 kphfb

@kphfb :

  1. Does an "object" exist only during the lifetime of a channel between two parties?
  2. Is it logically shared, with each party storing in its local DB the latest version it knows?
  3. Would it be advisable for parties on both sides of a channel to move objects to a durable store once a response on the channel allows a transaction to be submitted to the blockchain?

dahliamalkhi avatar Oct 10 '20 04:10 dahliamalkhi

@dahliamalkhi

  1. An object exists from when it is created until it is finalized on chain or closed out via off-chain APIs (i.e. canceled)
  2. Correct
  3. Yes, although every operation should always be persisted to a database anyway

kphfb avatar Oct 12 '20 17:10 kphfb

Hi all,

In the past we discussed making a set of read commands that allow VASPs to read objects from each other. This can be used in the re-sync use-case among others.

In fact, the local API already has two functions that we could expose over the network to the other side as commands: get_payment_by_ref and get_payment_history_by_ref, here: https://github.com/libra/off-chain-reference/blob/eba72f53d4962fccd2033c98500c06bfaaacf4e2/src/offchainapi/core.py#L215

As @kphfb points out we thought this is not a necessity, but it is an option if people feel it is useful.
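
To make the re-sync idea concrete, here is a rough sketch of how a history read command could let an out-of-sync party catch up. Everything in it is an assumption for illustration: `Server`, `get_history`, and `resync` are hypothetical stand-ins, not the signatures of get_payment_by_ref or get_payment_history_by_ref, and the history is assumed to be ordered oldest first.

```python
class Server:
    """Hypothetical source-of-truth side holding version history per reference ID."""

    def __init__(self):
        self.history = {}  # reference_id -> list of versions, oldest first

    def get_history(self, reference_id):
        # Stand-in for a networked read command; not part of the published protocol.
        return list(self.history.get(reference_id, []))


def resync(local_versions, server, reference_id):
    """Bring local_versions up to date and return the versions that were missing."""
    remote = server.get_history(reference_id)
    latest = local_versions[-1] if local_versions else None
    if latest in remote:
        # Common case: fetch only what comes after our latest known version.
        missing = remote[remote.index(latest) + 1:]
        local_versions.extend(missing)
    else:
        # Nothing in common (e.g. disaster recovery): adopt the remote history.
        missing = remote
        local_versions[:] = remote
    return missing


server = Server()
server.history["ref-1"] = ["v1", "v2", "v3", "v4", "v5"]
client = ["v1"]                       # client is stuck at version 1
missing = resync(client, server, "ref-1")
print(missing)
```

In this sketch a client stuck at version 1 receives versions 2 through 5, and a client with an empty store simply adopts the full history.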

Best,

George

gdanezis avatar Oct 26 '20 13:10 gdanezis