ref-fvm icon indicating copy to clipboard operation
ref-fvm copied to clipboard

Technical Design: Link Database

Open Stebalien opened this issue 3 years ago • 4 comments

The FVM has several operations that require knowledge of the links between IPLD blocks:

  • When a block is opened, the FVM needs to know what it points to, so it can add those CIDs to the reachable set.
  • When a the state-tree is flushed to disk at the end of an epoch, the FVM needs to traverse the state-tree.
  • Garbage collection and/or reference counting will likely require knowledge of IPLD links as well.

To avoid having to re-parse the block for every such operation, the FVM may want to save a list of "links" along with the block itself. In addition to improving performance, this would allow the FVM accurately to charge for block parsing once, up-front.

Implementation

An FVM implementation would store links in a key-value store by concatenating the CIDs together and storing them as a single blob, associated with the block's CID. Additionally, ipld::block_create must charge for the persistence (per-byte) of this "link block" along with the storage of the IPLD block itself.

However, this has several drawbacks:

  • We end up storing unnecessary data. Ideally, we'd just store a list of "offsets" in the blocks where links appear, but that doesn't generalize to all formats.
  • It's unclear if parsing this "link block" is going to be substantially faster than just parsing the block in some cases.
  • This requires two datastore read/writes per actor read/write. An alternative would be to concatenate the "link block" with the actual block when storing it, but that would require an invasive change to the blockstore.

Alternatives

An alternative is to:

  1. Parse/validate the block on creation, storing the links in-memory but not persisting them.
  2. Parse/validate the block on open, also storing the links.

The downside is that we'd have to charge for parsing both when we open a block, and when we create it. However, we'd be able to use the parsed links on flush, set_root, etc.

Given the difficulty of implementing (and integrating) a full "link database" with lotus, we're likely going to go with this alternative.

Stebalien avatar Jun 28 '22 05:06 Stebalien

As you note, we don't have any metrics/facts that would warrant pursuing the linkstore as of today.

Without a clear picture of what other functionalities the linkstore would power (e.g. garbage collection), we can only consider it an optimization. Pursuing it now would be an effort of premature optimization.

For this reason, I agree with you that parsing on-demand and caching in memory seems like the most rational approach today.

If we ever decide to go in this direction, I would push back against a separate physical linkstore, and nudge us towards colocating link metadata with the block itself. A separate physical linkstore may even nullify the performance gain we are hoping to get, due to:

  1. Doubling the disk IO overhead per blockstore operation; i.e. disk latency.
  2. Doubling the database memory overhead (database indices are not cheap).

We should also avoid doubling the number of FFI calls, ie. we should adapt the blockstore calls to accept and dispense link tracking information.

If we were to implement this at some point, I would strongly advise to colocate the link data within the main blockstore, by wrapping the block data in a data structure. In Lotus, we could use Badger metadata to track when blockstore entries have been "upgraded" to track link information.

raulk avatar Jun 29 '22 15:06 raulk

(Didn't intend to close, not sure what happened!)

raulk avatar Jun 29 '22 15:06 raulk

Ok, GitHub broke Cmd+Enter.

raulk avatar Jun 29 '22 15:06 raulk

Confirmed.

raulk avatar Jun 29 '22 15:06 raulk