substrate icon indicating copy to clipboard operation
substrate copied to clipboard

Child trie and state machine refactors

Open cheme opened this issue 2 years ago • 10 comments

This PR expose the refactor from my current transient storage branch: https://github.com/cheme/substrate/tree/transient).

The transient branch changeset is big too and I want to split it in different PR (and LOT of testing is still needed in the target branch, and maybe taking the Rfc/PPP process).

This first PR is focusing on substrate changes with no additional functionalities. This will allow to focus this review on backward compatibility. But some changes merely make sense when related to the full change set and consulting the mentioned full branch may be useful.

No need to merge this right now, it is more a preparatory PR to show and discuss changes and direction, so low priority.

Looking at it, I also wonder if removing ChildInfo enum for just name or ChildType + name in read api would be better.

In this PR

  • Change sp-externalities:

    • storage access functions becomes &mut function: all in all this is not strictly necessary. - by allowing to write on read access allow evolution such as: - append optimizing: only write the appended content on write (store a Vec<Append>), and only resolve (concatenate and cache the changes) on read (for https://github.com/paritytech/polkadot-sdk/issues/30). - access analysis: cache access info.
    • value returns from Vec to Cow<[u8]>: this is rather incomplete and may be good to rollback at this point. Simply in sp_io we end up doing into own (would need some change to wasm interface proc macro to pass function context in the with_externality scope and return the u64 there, and only for the case of wasm calling host: a bit beyond this pr scope).
    • do not yet allow local block caching without locking or inner-mutability, would need to change state-machine backend trait similarily, but a few things in place makes it hard to do (I stopped on the runtime fetcher (need clone on non mutuable), it is only this, there may be easy way to handle this by adding runtime fetching to the state-machine backend, but again a bit beyond this PR scope.
  • remove change trie extrinsic indexing in state-machine

In full branch

Generally I am not very happy with multiplexing some function under ChildInfo.

  • Blob and Btree transient storage are part of the child tree mechanism. This choice could allow easier implementation of non transient variant, but there may be some awkward interfacing (choosing between a generit child state function or a dedicated one).

  • Blob

    • chunked data (every changes are stored by chunks).

      • A big concern here is reconstruction (concatenating all chunks), this could be cached with the mutability change (not implemented yet).
        • Chunk size is a bit random too, my only concern was to make it small enough and still aligned with most chunks (eg the chunk of blake3 tree).
    • hashing is done on new host function that keep trace of hasher: this is very discutable

      • blob changes are still in memory when using cumulus: may want to have them also in host side.
        • the hasher state is stored in the host memory (in statemachine overlay).
      • polkadot should use the host function (and have a change overlay attach when executing pvf).
  • api

    • mix of encoded non encoded hashes, is a bit ~

polkadot companion: https://github.com/paritytech/polkadot/pull/6550 cumulus companion: https://github.com/paritytech/cumulus/pull/2087

cheme avatar Dec 22 '22 13:12 cheme

Putting this out of draft, some change can be a bit controversial (can rollback them and specialize changes to the transient storage api only):

  • [] storage function using range of bytes as input parameter (and range not using rust trait).
  • [] storage function returning Cow<[u8]> to avoid instantiating a Vec when hitting the change overlay. Which in practice will be instantiated on the wasm boundary.
  • [] externality switching most & function to &mut, but not state machine backend (in practice we got often write on read access but some of those are part of state machine backend).
  • [] moving partially to use DefaultChild rather than ChildInfo, generally specializing to a given child type (considering child state can have very different api). But at the same time target branch keep using child info in state machine (sometime it feels a bit forcefull). -> might just go fully to avoid ChildInfo as much as possible already.
  • [] OverlayChanges renamed to Changes: could only be local to state machine only (and exported as OverlayChanges).

cheme avatar Jan 11 '23 17:01 cheme

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Apr 20 '23 19:04 stale[bot]

+1 for removing ChildInfo enum. Will make code so much easier to follow. DefaultChild is a bit of a confusing name. It is not a child after all, but a child tree as far a I understand. Are there any other type of child tries other than "default" planned in the immediate future?

arkpar avatar Apr 20 '23 21:04 arkpar

+1 for removing ChildInfo enum. Will make code so much easier to follow. DefaultChild is a bit of a confusing name. It is not a child after all, but a child tree as far a I understand.

Yes it as actually more a DefaultMerkleTrie or DefaultTrie, actuallly TrieDefault (a merkle trie using the default substrate state) would make more sense or TrieTop (a merkle trie using the top trie state definition). TrieTop might actually not be future proof, so I think I will switch to TrieDefault unless I got a better idea. (need to first rebase on master has branch went stale and I also got changes from https://github.com/w3f/PPPs/pull/11 to report on https://github.com/cheme/substrate/tree/transient (which also uses this branch))

Are there any other type of child tries other than "default" planned in the immediate future?

There is the transient btree storage and the blob storage (for the second one it is more a child state). Both are actually not child in the sense that their root or hash are not always attached. It is more an attached transient state. In my implementation (IIRC), the choice is something that can be discussed. It is more attached content (unless runtime write the root or hash in state and archive the transient info then it can be accessed as child state from a rpc or other). Going in the child state definition direction is only really interesting in case it could switch to being persisted, but this is really not even plan (would need a different db storage). So only benefit is to get something a bit similar in code, but there is only a few functions that handle the enum due to different api, and I remember thinking it would be good to separate them in some case (to make things more readable).

cheme avatar Apr 21 '23 07:04 cheme

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 23 '23 09:06 stale[bot]

will soon.

cheme avatar Jun 23 '23 09:06 cheme

This PR expose the refactor from my current transient storage branch: https://github.com/cheme/substrate/tree/transient).

This branch will effectively address https://github.com/paritytech/polkadot-sdk/issues/245?

kianenigma avatar Jun 27 '23 09:06 kianenigma

This PR expose the refactor from my current transient storage branch: https://github.com/cheme/substrate/tree/transient).

This branch will effectively address paritytech/polkadot-sdk#245?

Yes it target this one, but it is a version with many host function, I am testing a different approach in branch "https://github.com/cheme/substrate/tree/transient-global-mem" where I just expose a global wasm memory (change overlay with transaction support) and a few host function (for hashing and storing).

cheme avatar Jun 27 '23 09:06 cheme

Hey, is anyone still working on this? Due to the inactivity this issue has been automatically marked as stale. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 29 '23 02:07 stale[bot]

https://github.com/paritytech/substrate/pull/13940 is more prioritary for now, this can be close until then I think.

cheme avatar Jul 30 '23 17:07 cheme