Addressing and loading WASM bytecode for actors

Open raulk opened this issue 3 years ago • 0 comments

Built-in actors Code CIDs

Today, built-in actors have synthetic Code CIDs. Each one combines the "IPLD raw" codec (0x55) with an identity multihash of a predetermined string that follows the convention fil/<actor type>/<version as ordinal>; for example, fil/account/5 is the built-in account actor for actors v5. This special-casing exists because there is currently no canonical, portable bytecode format for built-in actors. Nodes use Code CIDs to dispatch execution to the appropriate actor, and to the correct version thereof.
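To make the byte layout concrete, here is a minimal sketch of how such a synthetic Code CID could be assembled, assuming the standard binary CIDv1 layout (version, codec, multihash) with unsigned-varint fields. The function names are made up for illustration and are not taken from any FVM or Lotus API.

```rust
/// Append `value` as a multiformats unsigned varint (LEB128, 7 bits per byte).
fn put_uvarint(out: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

const CID_V1: u64 = 0x01; // CID version 1
const CODEC_RAW: u64 = 0x55; // "IPLD raw" multicodec
const MH_IDENTITY: u64 = 0x00; // identity multihash: the digest is the input itself

/// Hypothetical helper: build the binary CID for a name such as "fil/account/5".
fn synthetic_code_cid(name: &str) -> Vec<u8> {
    let digest = name.as_bytes();
    let mut cid = Vec::with_capacity(4 + digest.len());
    put_uvarint(&mut cid, CID_V1);
    put_uvarint(&mut cid, CODEC_RAW);
    put_uvarint(&mut cid, MH_IDENTITY);
    put_uvarint(&mut cid, digest.len() as u64);
    cid.extend_from_slice(digest);
    cid
}
```

Because the multihash is the identity function, the CID carries the human-readable name verbatim and says nothing about the code that will actually run.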

Actor versioning

When evolving the Filecoin protocol, there are several versions that come into play:

  • Network version
  • Actor version
  • State tree version

The actor version is only bumped upon ABI-breaking changes. Logic changes do not lead to bumping the actor version unless they come bundled with interface changes. That means that the fil/account/5 Code CID can serve any number of iterations of the actor's logic, as long as the external ABI remains constant. Logic changes are implemented through control flow, where the actor takes a local decision based on the network version in effect.
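As an illustration of that control-flow pattern, here is a minimal sketch. The method, the activation version and the fee rule are all invented for the example; only the idea of branching on the network version in effect comes from the text above.

```rust
/// Hypothetical network version type.
type NetworkVersion = u32;

/// Hypothetical activation point for the new logic.
const NEW_FEE_RULE_FROM: NetworkVersion = 13;

/// The signature (the ABI) is the same on every network version.
/// Only the behaviour behind it changes, so the Code CID stays the same.
fn compute_fee(nv: NetworkVersion, amount: u64) -> u64 {
    if nv >= NEW_FEE_RULE_FROM {
        // Logic introduced by a later network upgrade (made-up rule).
        amount / 2
    } else {
        // Original logic, kept so that old epochs still validate.
        amount
    }
}
```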

Furthermore, it is possible to deploy security, bug and performance fixes without even bumping the actor or network version, as long as they do not break consensus.

Today's CodeCID serves as a routing/dispatch table, but does not strictly represent the actual code to run.

Content-addressed WASM bytecode

First of all, we should define a multicodec for "WASM bytecode". Second, we need to figure out the IPLD schema for code.

Laying it out as a single blob is sufficient for M0 and M1 (built-in actors), because no code will be exchanged in the network. For M2+ (user-deployed actors), code will be exchanged as part of deployment messages (and potentially during execution, depending on the upcoming consensus models), so a chunked strategy would enable parallelized transfers across the network (e.g. via Bitswap).

It might be worthwhile to invest in symbol-wise variable chunking, where symbols end up in different chunks such that we can conduct symbol-level deduplication at the storage layer (e.g. reused functions).
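A rough sketch of the chunking and deduplication idea follows. It uses fixed-size chunks as a stand-in for symbol-wise chunking (which would need a WASM parser), and an in-memory map as a stand-in for a content-addressed blockstore. All type and function names are hypothetical.

```rust
use std::collections::HashMap;

/// Split a bytecode blob into fixed-size chunks (the last one may be shorter).
fn chunk_fixed(bytecode: &[u8], chunk_size: usize) -> Vec<&[u8]> {
    assert!(chunk_size > 0, "chunk size must be non-zero");
    bytecode.chunks(chunk_size).collect()
}

/// Toy deduplicating store: identical chunks are kept only once.
#[derive(Default)]
struct ChunkStore {
    chunks: Vec<Vec<u8>>,
    index: HashMap<Vec<u8>, usize>,
}

impl ChunkStore {
    /// Store `chunk` if it is new and return the slot it lives in.
    fn put(&mut self, chunk: &[u8]) -> usize {
        if let Some(&slot) = self.index.get(chunk) {
            return slot;
        }
        let slot = self.chunks.len();
        self.chunks.push(chunk.to_vec());
        self.index.insert(chunk.to_vec(), slot);
        slot
    }

    /// Store every piece of a module and return its layout as a list of slots.
    fn add_module(&mut self, pieces: &[&[u8]]) -> Vec<usize> {
        pieces.iter().map(|p| self.put(p)).collect()
    }

    /// Rebuild the original blob from a layout.
    fn reassemble(&self, layout: &[usize]) -> Vec<u8> {
        layout
            .iter()
            .flat_map(|&slot| self.chunks[slot].iter().copied())
            .collect()
    }
}
```

In a real design the slot index would be a CID and the layout would be an IPLD node linking to the chunks, which is what would let the pieces be fetched in parallel.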

Caching

The FVM should cache compiled WASM modules. The cache will be invalidated occasionally, either entirely, or by invalidating specific entries.

  • Automatically.
    • On WASM runtime/compiler upgrades.
    • On compiler configuration changes.
    • On changing "implicit logic" like gas accounting at the protocol level.
    • On garbage collection, e.g. when code expires from the chain (if we decide to implement such policies), when all actors associated with a Code CID are destructed, or under custom policies such as LRU/LFU.
    • On built-in actor versions going out of scope on network upgrades.
  • Manual purging. The node operator can decide to purge at any time.
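One way to get the automatic cases almost for free is to tag the cache with a fingerprint of everything that influences compiled output, and to drop all entries when the fingerprint changes. The sketch below assumes that approach. The struct names and fingerprint fields are hypothetical, and the "compiled module" is just a byte vector.

```rust
use std::collections::HashMap;

/// Hypothetical fingerprint of everything that affects compilation output.
#[derive(Clone, Debug, PartialEq, Eq)]
struct EngineFingerprint {
    runtime_version: String,
    compiler_config: String,
    gas_schedule: u32,
}

/// Stand-in for a compiled WASM module.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CompiledModule(Vec<u8>);

struct ModuleCache {
    fingerprint: EngineFingerprint,
    entries: HashMap<Vec<u8>, CompiledModule>,
}

impl ModuleCache {
    fn new(fingerprint: EngineFingerprint) -> Self {
        Self { fingerprint, entries: HashMap::new() }
    }

    /// Automatic invalidation: drop everything if the engine changed.
    fn ensure_fingerprint(&mut self, current: &EngineFingerprint) {
        if &self.fingerprint != current {
            self.entries.clear();
            self.fingerprint = current.clone();
        }
    }

    fn get(&self, code_cid: &[u8]) -> Option<&CompiledModule> {
        self.entries.get(code_cid)
    }

    fn insert(&mut self, code_cid: Vec<u8>, module: CompiledModule) {
        self.entries.insert(code_cid, module);
    }

    /// Targeted invalidation of one entry; returns whether it was present.
    fn invalidate(&mut self, code_cid: &[u8]) -> bool {
        self.entries.remove(code_cid).is_some()
    }

    /// Manual purge by the node operator.
    fn purge(&mut self) {
        self.entries.clear();
    }

    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Garbage-collection policies such as LRU/LFU would sit on top of `invalidate` and are left out here.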

Caching logic for M0 and possibly M1 can be simple, as those milestones will only run built-in actors.

Design proposals

M0 (only built-in actors)

(We are still dealing with synthetic CIDs, and for correctness reasons it's a bad idea to insert the WASM bytecode into the blockstore.)

  1. The bytecode of built-in actors is bundled inside the node's binary (e.g. Lotus) as embedded data.
  2. There's a lookup table that resolves requests for a CodeCID of the kind fil/account/5 to the embedded bytecode.
  3. When the FVM runs an actor, it queries its module cache.
    a. If it's a hit, no bytecode is requested from the node, and execution proceeds with the cached module.
    b. Otherwise, the bytecode is requested through a GetCode extern call. The GetCode call on the node side uses the lookup table defined in (2) to return the appropriate bytecode.
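The three steps above can be sketched as follows. Only the GetCode name comes from the proposal. The trait shape, the struct names, the use of plain strings as Code CIDs and the no-op `compile` function are assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

/// Hypothetical shape of the node-side extern.
trait Externs {
    fn get_code(&self, code_cid: &str) -> Option<Vec<u8>>;
}

/// Node side (steps 1 and 2): bytecode embedded in the binary plus a lookup table.
struct EmbeddedActors {
    table: HashMap<&'static str, &'static [u8]>,
}

impl EmbeddedActors {
    fn new(entries: &[(&'static str, &'static [u8])]) -> Self {
        Self { table: entries.iter().copied().collect() }
    }
}

impl Externs for EmbeddedActors {
    fn get_code(&self, code_cid: &str) -> Option<Vec<u8>> {
        self.table.get(code_cid).map(|bytes| bytes.to_vec())
    }
}

/// Placeholder for real compilation: returns the bytecode unchanged.
fn compile(bytecode: &[u8]) -> Vec<u8> {
    bytecode.to_vec()
}

/// FVM side (step 3): consult the module cache, fall back to the extern on a miss.
struct Machine<E: Externs> {
    externs: E,
    module_cache: HashMap<String, Vec<u8>>,
    extern_calls: usize, // counts GetCode round trips, for illustration only
}

impl<E: Externs> Machine<E> {
    fn new(externs: E) -> Self {
        Self { externs, module_cache: HashMap::new(), extern_calls: 0 }
    }

    fn load_module(&mut self, code_cid: &str) -> Option<&Vec<u8>> {
        if !self.module_cache.contains_key(code_cid) {
            // Cache miss: ask the node for the bytecode, then compile and cache it.
            self.extern_calls += 1;
            let bytecode = self.externs.get_code(code_cid)?;
            self.module_cache.insert(code_cid.to_string(), compile(&bytecode));
        }
        self.module_cache.get(code_cid)
    }
}
```

A second `load_module` call for the same Code CID is served from the cache and does not cross the extern boundary again.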

M1 (only built-in actors)

The above flow applies, but at this stage we should deprecate synthetic CodeCIDs and move to an actual content-addressed code scheme. There are two options for shipping code for built-in actors: embedded in the binary, or outside the binary. What gets tricky now is shipping built-in actor improvements that do not break consensus (and for which a network upgrade isn't necessary). Previously these did not imply CodeCID changes; with content addressing, they will.

I wonder if we need a "remapping/override table", so that node operators can override CodeCIDs in the state tree with alternative code.
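A minimal sketch of what such an override table could look like, assuming a single-level mapping consulted before code is loaded. The struct and method names are hypothetical, and plain strings stand in for real CIDs.

```rust
use std::collections::HashMap;

/// Hypothetical operator-controlled remapping of Code CIDs.
#[derive(Default)]
struct CodeOverrides {
    remap: HashMap<String, String>,
}

impl CodeOverrides {
    /// Run `replacement` whenever the state tree references `on_chain`.
    fn set(&mut self, on_chain: &str, replacement: &str) {
        self.remap.insert(on_chain.to_string(), replacement.to_string());
    }

    /// Resolve the CID of the code to execute. Overrides are not chained:
    /// the replacement is used as-is, even if it has an entry of its own.
    fn resolve<'a>(&'a self, on_chain: &'a str) -> &'a str {
        self.remap.get(on_chain).map(String::as_str).unwrap_or(on_chain)
    }
}
```

The state tree keeps the original CodeCID, so the override stays a local node decision and is only safe for changes that do not break consensus.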

M2+ (support for user-deployed actors)

WIP; I haven't thought this through thoroughly yet. The following are some rough notes.

  • I assume there will be a gas cost per byte of WASM bytecode deployed in the network.
  • I assume there will be a compilation gas cost. Do we compile as soon as code is deployed? How do we measure the cost of the compilation? Do we persist the compilation output in the module cache?
  • Where do we store the bytecode? In the state blockstore, or do we segregate code into its own blockstore (and create abstractions for traversing links across the state and code blockstores, for things like GC)?

raulk, Dec 12 '21 18:12