libplanet icon indicating copy to clipboard operation
libplanet copied to clipboard

Make illegal states of `Block<T>` & `Transaction<T>` unrepresentable

Open dahlia opened this issue 4 years ago • 6 comments

Related: #1146.

To be elaborated.

dahlia avatar Jan 18 '21 08:01 dahlia

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale[bot] avatar Mar 19 '21 22:03 stale[bot]

Preamble

Well, this is really getting out of hand.

Indeed, Block<T> is being used in multiple stages without regards to its validity. There are historical reasons why this has come to be the case, but it might be preferable to overhaul the whole class at this point.

After a careful inspection, there seems to be two major issues:

  • The content of Block<T> is largely divided into two groups.
    • Those not dependent on PreEvaluationHash and those dependent on PreEvaluationHash.
  • Evaluation process is rather convoluted.

To summarize the issues even further, in essence, the design of Block<T> seems to have evolved far too much for its surrounding support classes/methods, which are way too generic, to be useful. I suggest we decide asap whether to systematically support "two-stage", or even "multi-stage", Block<T>.

There are also policy level complications arising from having StateRootHash and Hash derived from PreEvaluationHash, due to both not being secured by PoW for the newest block. Although this is not entirely a separate issue, these should be discussed more in depth if necessary.

For everyone's sanity, and especially mine, some overview of the creation processes of a Block<T> might be in order.

Creation of Block<T>

There are mainly two ways for a Block<T> instance to be created: mining and deserialization.

Mining

This is where a Block<T> gets created through mining. In summary, Block<T>() constructor is called with appropriate Nonce value. However, the actual process is much more complicated.

  • For the creation of a new Block<T> through mining, Block<T>.Mine() is called from BlockChain<T>.MineBlock().
  • Block<T>.Mine() returns a new Block<T> instance only containing minimal amount of information without either StateRootHash or Hash.
    • Transaction<T>'s are ordered by Id before Block<T>() is called.
    • Inside Block<T>(), Transaction<T> is ordered by Id once again in preparation for calculating PreEvaluationHash on the fly.
      • There is an error where SerializeForHash() is called to include StateRootHash when calculating PreEvaluationHash. Since StateRootHash has not yet been assigned, Block<T>.StateRootHash of null gets converted to BlockHeader.StateRootHash of Empty, and then to non-existance in its dictionary representation, resulting in accidental proper evaluation of PreEvaluationHash.
    • Once PreEvaluationHash is calculated in Block<T>(), StateRootHash is assigned null by default, and Block<T>.Hash is calculated via calling SerializeForHash() again, which results in having the same value as PreEvaluationHash.
      • Meanwhile, Block<T>.Header is called twice to create BlockHeader on the fly.
    • Once PreEvaluationHash, "StateRootHash", and "Hash" are assigned, Transaction<T>s are reordered to its execution order, which depend on PreEvaluationHash value as random seed.
      • This is a rather heavy operation, or so I have heard, so should be used with caution.
  • Once Block<T>.Mine() has returned such partial Block<T> instance to BlockChain<T>.MineBlock(), local IReadOnlyList<ActionEvaluation> is generated by calling BlockEvaluator.EvaluateActions() with said Block<T> and a StateCompleterSet<T> in preparation for getting the StateRootHash of the Block<T>.
    • With some null check, StateCompleterSet<T> is split to AccountStateGetter and AccountBalanceGetter and used to execute every IAction in the block via Block<T>.Evaluate().
      • Inside Block<T>.Evaluate(), there is a redundant null check for AccountStateGetter and AccountBalanceGetter.
      • Block<T>.Header is then validated. Aside from on the fly property being called yet again, this does not seem to be the best place to validate the Header of a Block<T>.
        • Once Block<T>.Header is validated, IActions in each Transaction<T> is evaluated by calling EvalutateActionsPerTx().
        • Yet another null check for AccountStateGetter and AccountBalanceGetter is performed.
        • For each Transaction<T>, which is in evaluation order, even though this is all happening inside a "partially" constructed Block<T> instance, Transaction<T>.EvaluateActionsGradually() is called to evaluate IActions inside a Transaction<T>.
          • Inside Transaction<T>.EvaluateActionsGradually(), everything gets passed on to ActionEvaluation.EvaluateActionsGradually(), which is a method from some external evaluation helper. This completes the cycle of going from an external helper method of BlockEvaluator to an internal method of Block<T> then to Transaction<T> then finally to an external helper method of ActionEvaluation.
    • Once Block<T>.Evaluate() is complete, Block<T> as an action is evaluated via BlockEvaluator.EvaluateBlockAction().
  • Now that necessary IReadOnlyList<ActionEvaluation> is generated, BlockChain<T>.SetStates() is called to obtain StateRootHash for the said Block<T> instance.
    • Inside BlockChain<T>.SetStates(), this is simply redirected to StateStore.SetStates().
      • To calculate StateRootHash, TrieStateStore.EvalState() is called. Note EvalState() is not part of IStateStore interface.
      • Evaluated StateRootHash is stored with Block<T>.Hash which is the same as PreEvaluationHash at this point.
  • Once StateRootHash is finally obtained, a new Block<T> instance is created from the old Block<T> and this StateRootHash.
    • The same constructor is called, only now with StateRootHash provided. The entire process of initial Block<T> creation above is repeated.
  • With newly created Block<T>, StateStore.SetStates() is called once again to set evaluated Block<T> states with a proper Hash.

Deserialization

This is when data is received over the network and the corresponding Block<T> is instantiated from the serialized data. The process is much more straightforward since all the necessary values are already provided. Note that the deserialization process does not share any usage of a constructor with the mining process. This has left a gap in validation where the receiving node blindly trusts the transaction order transmitted by another node.

greymistcube avatar May 16 '21 15:05 greymistcube

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale[bot] avatar Jul 16 '21 12:07 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale[bot] avatar Nov 09 '21 20:11 stale[bot]

I'm recently working on #1979, addresses #1974, and I'm roughly shaping an idea on all possible stages of transactions. It's parallel to stages on blocks drawn at #1492 in general, except that I'm trying to prefer composition over implementation inheritance this time.

Their stages can be placed on a 2×2 matrix, divided by two aspects: Signed or unsigned yet? Having actions or no actions? Here's the matrix:

Stages Lacking actions Having actions
Unsigned ITxMetadata[^1] ← TxMetadata[^1] ITxContent<T>TxContent<T>[^2]
Signed ISignedTxMetadata ITransaction<T> ←[^3] Transaction<T>

Note that some interfaces/classes are not implemented yet. I'm going to make a patch to complete the above matrix. For more details, here's a class diagram as well:

classDiagram
    ITxMetadata       <|-- ISignedTxMetadata : extends
    ITxMetadata       <|-- ITxContent~T~     : extends
    ISignedTxMetadata <|-- ITxExcerpt~T~     : extends
    ITxMetadata       <|-- TxMetadata        : implements
    ITxContent~T~     <|-- TxContent~T~      : implements
    ITxContent~T~     <|-- Transaction~T~    : implements
    ITxExcerpt~T~     <|-- Transaction~T~    : implements
    TxMetadata        *-- TxContent~T~       : composition
    TxContent~T~      *-- Transaction~T~     : composition
    class ITxMetadata {
        <<interface>>
        +long Nonce
        +Address Signer
        +DateTimeOffset Timestamp
        +PublicKey PublicKey
        +BlockHash? GenesisHash
        +IImmutableSet~Address~ UpdatedAddresses
    }
    class ISignedTxMetadata {
        <<interface>>
        +ImmutableArray~byte~ Signature
    }
    class ITxContent~T~ {
        <<interface>>
        +IImmutableList~T~ Actions
    }
    class ITxExcerpt~T~ {
        <<interface>>
        +TxId Id
    }
    class TxMetadata {
        <<sealed class>>
        +ToBencodex(IEnumerable~IAction~ actions, ImmutableArray~byte~? signature) Dictionary
    }
    class TxContent~T~ {
        <<sealed class>>
        -TxMetadata _metadata
        +Sign() Transaction~T~
        +ToBencodex(ImmutableArray~byte~? signature) Dictionary
    }
    class Transaction~T~ {
        <<sealed class>>
        -TxContent~T~ _content
        +ToBencodex() Dictionary
    }

FYI I'm going to let Transaction<T> have a TxMetadata instead of TxContent<T> on #1978 as there is no ITxContent<T> & TxContent<T> yet.

[^1]: Already implemented in #1978. [^2]: TxContent<T>.ToBencodex() method will be able to replace RawTransaction. [^3]: In #1978, Transaction<T> only implements ITxMetadata, because there is no ITxContent<T> yet. It will be fixed in the future.

dahlia avatar May 24 '22 01:05 dahlia

This issue has been automatically marked as stale because it has not had recent activity. Thank you for your contributions.

stale[bot] avatar Jul 31 '22 01:07 stale[bot]