hydra Backup & restore the state of a Hydra Head

What & Why

Currently, there is no mechanism to recover the current state of a Hydra Head when restarting the hydra-node (e.g. following a crash). As a consequence, the Hydra Head can't continue processing transactions and, even worse, trust is required between the participants to close the Head in a non-adversarial manner.

Persisting the state of the Hydra Head on disk and restoring it on restart will allow hydra-node to resume operations and recover from unexpected downtime.

Out of scope: Longer downtimes (how long depends on the contestation period, a Hydra protocol parameter) are not covered!

Requirements

The hydra-node can be restarted without losing its knowledge of Hydra Head(s)
An open Hydra Head can always be closed after restart
An initiated Hydra Head can always be aborted after restart
A restarted Hydra node (ideally) can progress in L2 transaction processing
It's acceptable that a Hydra Head might still not progress, e.g. because of missed network events (related #188)

To be discussed

Technical detail: Shall we store the accumulated head state or all incoming events before they are processed?
Storage format: backward-compatibility, introspect-ability ("white-box" & audit a running Hydra Head?)
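
To make the first question more concrete, here is a minimal sketch of both options. All types and names (Event, HeadState, step, encodeState, replayLog) are hypothetical, heavily simplified stand-ins and not the actual hydra-node types; the Show/Read encoding is only a placeholder for whatever storage format gets chosen.

```haskell
import Data.List (foldl')

-- Hypothetical, heavily simplified stand-ins; not the actual hydra-node types.
data Event = Initialized | Opened | TxApplied Int | Closed
  deriving (Eq, Show, Read)

data HeadState = Idle | Initial | Open [Int] | Final
  deriving (Eq, Show, Read)

-- Pure transition function; events that do not fit the current state are ignored.
step :: HeadState -> Event -> HeadState
step Idle Initialized = Initial
step Initial Opened = Open []
step (Open txs) (TxApplied tx) = Open (txs ++ [tx])
step (Open _) Closed = Final
step st _ = st

-- Option A: persist only the accumulated state (overwritten after each event).
encodeState :: HeadState -> String
encodeState = show

decodeState :: String -> HeadState
decodeState = read

-- Option B: persist every incoming event, one per line, before processing it.
encodeEvent :: Event -> String
encodeEvent ev = show ev ++ "\n"

-- Restoring under option B means replaying the whole log from the idle state.
replayLog :: String -> HeadState
replayLog = foldl' step Idle . map read . lines
```

In this toy model both options restore the same state. The trade-off is that option A is cheap to restore but loses the history, while option B keeps an auditable log but has to replay it (and needs the transition function to stay compatible with old events).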

Tasks

[ ] #554
[x] #257
[ ] #541

Jan 30 '22 17:01 ch1bo

Added some bullets for some subtasks of this.

Apr 19 '22 14:04 ch1bo

One thing to consider: When we restore from persistence, we would need to know which ChainPoint we had reached before and resynchronize with the node from that point onward. Otherwise, we might "miss" chain events?
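
A rough sketch of what "resynchronize from that point onward" could mean, assuming the ChainPoint is persisted together with the head state. The types below (ChainPoint, Block, observations) are made up for illustration and the chain is modelled as a plain list, which is not how the node is actually queried.

```haskell
-- Hypothetical, simplified types; a real ChainPoint carries a slot and a block hash.
data ChainPoint = ChainPoint {pointSlot :: Word, pointHash :: String}
  deriving (Eq, Show)

-- A block together with the head-relevant observations found in it.
data Block = Block {blockPoint :: ChainPoint, observations :: [String]}
  deriving (Eq, Show)

-- Blocks strictly after the persisted point, oldest first.
-- Nothing means the point is not on this chain at all.
blocksAfter :: ChainPoint -> [Block] -> Maybe [Block]
blocksAfter point chain =
  case dropWhile ((/= point) . blockPoint) chain of
    [] -> Nothing
    (_ : rest) -> Just rest

-- Observations we would have missed while the hydra-node was down.
missedObservations :: ChainPoint -> [Block] -> Maybe [String]
missedObservations point = fmap (concatMap observations) . blocksAfter point
```

Without the persisted point we could only start from the tip (and miss events) or from the origin (and re-process everything).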

Sep 09 '22 07:09 ch1bo

Also, rollbacks:

Could it be that we were temporarily on a fork, stopped the node, and then restore it wanting to synchronize from the same ChainPoint, while the cardano-node has meanwhile rolled back off that fork? When we reconnect to it, would we see a rollback or no intersection? How do we handle this, with or without storing the chain state?

Furthermore, if we have no intersection, we would need to know alternative past ChainPoints where we would want to synchronize from, i.e. this info needs to be in the persisted data. Right now, that info would be in the linked list of ChainStateAt.
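
As a sketch of that fallback, assuming the persisted data keeps the linked list of chain states: the ChainStateAt below is a simplified model with illustrative field names, and the onChain predicate is a hypothetical stand-in for asking the cardano-node whether it still knows a point.

```haskell
-- Hypothetical, simplified model; field names are illustrative only.
data ChainPoint = ChainPoint {pointSlot :: Word, pointHash :: String}
  deriving (Eq, Show)

data ChainStateAt s = ChainStateAt
  { chainState :: s
  , recordedAt :: Maybe ChainPoint -- Nothing: recorded before any block was seen
  , previous :: Maybe (ChainStateAt s)
  }

-- Points we could offer when looking for an intersection, newest first.
candidatePoints :: ChainStateAt s -> [ChainPoint]
candidatePoints st =
  maybe id (:) (recordedAt st) (maybe [] candidatePoints (previous st))

-- Newest persisted state whose point is still on the node's chain.
-- A state without a recorded point is treated as always valid.
rollbackTo :: (ChainPoint -> Bool) -> ChainStateAt s -> Maybe (ChainStateAt s)
rollbackTo onChain st
  | maybe True onChain (recordedAt st) = Just st
  | otherwise = previous st >>= rollbackTo onChain
```

If the newest point was on an abandoned fork, rollbackTo walks back to the most recent state the node still agrees with, and we would synchronize from there.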

Sep 09 '22 07:09 ch1bo

Shall we figure out how to replay some events to API clients on restart?

Oct 24 '22 10:10 pgrange
