Simpler faults and recovery if penalties are reduced
The spec currently contains some complicated data structures for representing layers of faults and "recovered" sectors. Explanation is lacking, but I think it all exists to represent the case where a sector was unavailable for some period of time fully contained within a single proving period, but became available again before the period ended. This is predicated on the proof being a long-running computation that samples sectors throughout the process.
- Is the proof construction highly likely to have that property? I understand there are still different constructions under consideration, but my takeaway from @nicola’s brief description of "rational post" with no VDF is that it doesn't. "beacon post" would, though.
- Is this complexity really worthwhile? If we avoid heavy penalties for faults (#407), the penalty for losing a sector becomes only the need to re-commit it (#408), and maybe pay up to deal clients.
- Does this really solve the problem? In my understanding, it doesn't solve the problem in the case that the sector unavailability crosses the proving period boundary, even if the fault duration is relatively short.
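To make the last point concrete, here is a minimal sketch of the only case the layered fault/recovered structures seem to cover: a fault interval that lies entirely inside one proving period. The function name, the epoch-based parameters, and the 100-epoch period length are all hypothetical and are not taken from the spec.

```python
def fault_contained_in_period(fault_start, fault_end, period_start, period_len):
    """Return True if the fault interval lies entirely within one proving period.

    All arguments are epoch numbers (hypothetical units, not from the spec).
    """
    period_end = period_start + period_len
    return period_start <= fault_start and fault_end <= period_end


# A 10-epoch fault in the middle of a 100-epoch period is covered.
print(fault_contained_in_period(10, 20, 0, 100))   # True
# An equally short fault that straddles the period boundary is not.
print(fault_contained_in_period(95, 105, 0, 100))  # False
```

Both faults last 10 epochs, but only the first one would benefit from a "recovered" set.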
If the penalties for a fault are small and a transient failure can be recovered from cheaply (#408), I propose we just allow the PoSt to come with a single list of failed sector ids and no "recovered" set. The sectors can be re-committed (possibly even before the proving period ends!). Taken all the way, we might not even need a distinguished done set if the storage power measurement is secure without relying on heavy penalties (#403).
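A rough sketch of what that simplified flow could look like, assuming the miner's state is just a set of sector ids being proven. `MinerState`, `submit_post`, and `recommit_sector` are hypothetical names for illustration, not actor methods from the spec, and proof verification is omitted entirely.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class MinerState:
    # Sector ids the miner is currently proving (hypothetical representation).
    proving_set: Set[int] = field(default_factory=set)


def submit_post(state: MinerState, failed_sector_ids: Iterable[int]) -> Set[int]:
    """Apply a PoSt that carries one flat list of failed sector ids.

    There is no "recovered" set: failed sectors are simply dropped.
    """
    failed = set(failed_sector_ids)
    unknown = failed - state.proving_set
    if unknown:
        raise ValueError(f"unknown sector ids: {sorted(unknown)}")
    state.proving_set -= failed
    return failed


def recommit_sector(state: MinerState, sector_id: int) -> None:
    """Re-commit a previously failed sector so it counts again."""
    state.proving_set.add(sector_id)


state = MinerState(proving_set={1, 2, 3})
submit_post(state, [2])      # sector 2 is reported as failed and dropped
recommit_sector(state, 2)    # the miner pays to bring it back
```

The recovery cost here is just the re-commit message, which is what makes the separate "recovered" bookkeeping unnecessary under this proposal.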
There might be some proof construction internals that are opaque to me that necessitate the layered faults independent of these considerations.
Originally asked in slack cc @sternhenri @dignifiedquire @whyrusleeping
Also, we might not need late fee mechanisms if a late PoSt can be recovered from by paying the gas cost of re-committing sectors (though for very large miners this could be a lot of messages). Just a thought.
If posts are actually a long-running process (VDFPost), then yes, we need the ability to continue in the face of a temporary failure, as waiting for the failure to resolve itself is likely to make us late on our submissions.
If we move to rationalPost, then this is not necessary, as changes to the fault set should be submitted to the chain as they are encountered (or not, if you want them to be temporary faults).
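As a sketch of the rationalPost case, assuming faults are declared on chain when the miner notices them rather than inside the proof itself: `ChainView`, `declare_faults`, and `sectors_to_prove` are hypothetical names, and whether an undeclared temporary fault actually goes unnoticed depends on the proof construction.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class ChainView:
    # Hypothetical on-chain view of one miner's sectors.
    proving_set: Set[int]
    fault_set: Set[int] = field(default_factory=set)


def declare_faults(chain: ChainView, sector_ids: Iterable[int]) -> None:
    """Record faults on chain as soon as the miner encounters them."""
    chain.fault_set |= set(sector_ids) & chain.proving_set


def sectors_to_prove(chain: ChainView) -> Set[int]:
    """Sectors the next PoSt must cover: everything not declared faulty."""
    return chain.proving_set - chain.fault_set


chain = ChainView(proving_set={1, 2, 3})
declare_faults(chain, [3])       # a fault the miner chooses to report
print(sectors_to_prove(chain))   # {1, 2}
```

A miner that expects a failure to be temporary would simply not call `declare_faults` for that sector.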