FluidFramework

Exploration prototype to support datastore nesting

Open vladsud opened this issue 1 year ago • 0 comments

This is an exploration into how we can support nested data stores. While it is super important to answer the question of why it's needed, I'd love to discuss that outside of this PR and focus here on how to get such a capability and the best way to accomplish it.

I see two possible ways:

  1. Change FluidDataStoreRuntime to support data stores as children.
    • That would require some form of unification of channel interfaces / interaction (IChannel & IFluidDataStoreChannel)
    • This also puts us on the path of reimplementing various container runtime features in the future, like aliasing of channels.
  2. Bring the DataStores class closer to the IFluidDataStoreChannel requirements, so that it can be instantiated as a child of DataStores.
    • This prototype explores this direction.
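
To make option 2 concrete, here is a minimal sketch of the idea: a collection of data stores that itself satisfies the contract a parent expects from a child, so one collection can be hosted inside another. All names below (`ChildChannel`, `LeafStore`, `StoreCollection`, `SummaryNode`) are hypothetical simplifications for illustration; they are not the real DataStores or IFluidDataStoreChannel shapes.

```typescript
// Hypothetical, simplified shapes; NOT the real Fluid Framework interfaces.
interface SummaryNode {
  [name: string]: SummaryNode | string;
}

// The minimal contract a parent needs from any child it hosts.
interface ChildChannel {
  readonly id: string;
  summarize(): SummaryNode;
}

// A leaf data store that holds a single string payload.
class LeafStore implements ChildChannel {
  constructor(public readonly id: string, private readonly payload: string) {}

  summarize(): SummaryNode {
    return { payload: this.payload };
  }
}

// A collection of children that also satisfies ChildChannel,
// so a collection can be added as a child of another collection.
class StoreCollection implements ChildChannel {
  private readonly children: { [id: string]: ChildChannel } = {};

  constructor(public readonly id: string) {}

  add(child: ChildChannel): void {
    if (Object.prototype.hasOwnProperty.call(this.children, child.id)) {
      throw new Error(`Duplicate child id: ${child.id}`);
    }
    this.children[child.id] = child;
  }

  // Each child contributes one subtree, keyed by its id.
  summarize(): SummaryNode {
    const tree: SummaryNode = {};
    for (const id of Object.keys(this.children)) {
      tree[id] = this.children[id].summarize();
    }
    return tree;
  }
}

// Usage: a collection nested inside another collection.
const rootCollection = new StoreCollection("root");
const nestedCollection = new StoreCollection("nested");
nestedCollection.add(new LeafStore("leaf", "hello"));
rootCollection.add(nestedCollection);
rootCollection.add(new LeafStore("sibling", "world"));
const rootSummary = rootCollection.summarize();
```

The point of the sketch is only that the parent never needs to know whether a child is a leaf or another collection.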

I've started poking at #2, and it turned out not to be that bad. I was able to refactor enough code in 2-3 days that all existing flows seem to continue to work, and (with some hacks) I can build a minimalistic test case for nested data stores (see DataStoresNested.spec.ts).

Key things that need to be done in order to make it real:

  1. I introduced IFluidParentContext as a subset of IFluidDataStoreContext so I could keep moving forward incrementally. We need to end up with just one interface.
  2. API surface - I have not spent any time defining the API surface for nested DataStores - as you can see from the unit test, I simply hacked things around. More refactoring (moving ContainerRuntime implementations into DataStores and thus exposing the same APIs) is likely the first step.
    • That said, I fully expect that we will not expose this functionality in the form of a generic API at first. I can take advantage of that structure by extending the DataStores class and exposing a custom API for specific goals (DB workstream) only.
  3. Somewhat related to the above: I did not spend much time thinking about visibility change flows / aliasing. This needs to be thought through.
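
The "subset interface" approach from item 1 can be illustrated with a small sketch. The member lists below are invented for illustration (the real IFluidParentContext and IFluidDataStoreContext have different and larger surfaces); only the structural relationship matters: anything written against the narrow interface accepts either kind of parent.

```typescript
// Hypothetical members; the real Fluid interfaces differ.
interface ParentContextSketch {
  readonly clientId: string | undefined;
  submitMessage(type: string, content: unknown): void;
}

// The wider context adds members on top of the narrow one.
interface DataStoreContextSketch extends ParentContextSketch {
  readonly id: string;
  readonly packagePath: readonly string[];
}

// Code that depends only on the narrow interface works with both.
function sendHello(parent: ParentContextSketch): void {
  parent.submitMessage("hello", { from: parent.clientId });
}

const sentLog: string[] = [];

const containerLevelParent: ParentContextSketch = {
  clientId: "client-1",
  submitMessage: (type) => {
    sentLog.push(`container:${type}`);
  },
};

const dataStoreLevelParent: DataStoreContextSketch = {
  id: "store-a",
  packagePath: ["pkg"],
  clientId: "client-1",
  submitMessage: (type) => {
    sentLog.push(`store-a:${type}`);
  },
};

sendHello(containerLevelParent);
sendHello(dataStoreLevelParent);
```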

Note that this direction does not allow us to have a DDS and a DataStore under the same parent (yet). Some level of unification of IChannel & IFluidDataStoreChannel is required, but this could be done independently, when needed. I do not think it will be that hard (we might either change all existing DDSs or build an adapter that wraps IChannel, or both). That said, getting rid of the delayed handle / object attachment flow would be a welcome change / simplification on this path.
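
As a rough illustration of the "adapter that wraps IChannel" option, the sketch below wraps a DDS-like object so it can be hosted where a data-store-like channel is expected. `DdsLike`, `StoreChannelLike`, `CounterDds`, and `DdsAsStoreChannel` are hypothetical stand-ins; the real IChannel and IFluidDataStoreChannel have more members and different signatures.

```typescript
// Hypothetical, simplified shapes; NOT the real IChannel / IFluidDataStoreChannel.
interface DdsLike {
  readonly id: string;
  applyOp(contents: unknown, local: boolean): void;
  snapshot(): string;
}

interface StoreChannelLike {
  readonly id: string;
  process(message: { contents: unknown }, local: boolean): void;
  summarize(): { [path: string]: string };
}

// A toy DDS: a counter that sums the numeric ops it receives.
class CounterDds implements DdsLike {
  private value = 0;

  constructor(public readonly id: string) {}

  applyOp(contents: unknown, _local: boolean): void {
    if (typeof contents === "number") {
      this.value += contents;
    }
  }

  snapshot(): string {
    return String(this.value);
  }
}

// The adapter translates the store-channel calls into DDS calls.
class DdsAsStoreChannel implements StoreChannelLike {
  public readonly id: string;

  constructor(private readonly dds: DdsLike) {
    this.id = dds.id;
  }

  process(message: { contents: unknown }, local: boolean): void {
    this.dds.applyOp(message.contents, local);
  }

  summarize(): { [path: string]: string } {
    return { content: this.dds.snapshot() };
  }
}

// Usage: the parent only ever sees the StoreChannelLike shape.
const adaptedCounter: StoreChannelLike = new DdsAsStoreChannel(new CounterDds("counter"));
adaptedCounter.process({ contents: 2 }, true);
adaptedCounter.process({ contents: 3 }, false);
```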

The most interesting point of discussion (if we go in that direction) - what is the future of the FluidDataStoreRuntime class? The storage layout of FluidDataStoreRuntime is different from how DataStores summarizes (for example - the attributes blob), and thus you can't just substitute one for the other, even if functionality-wise DataStores is a superset of FluidDataStoreRuntime. I hope that some cheap adapters could be built here (possibly a snapshot transformation adapter) to support back-compat without having to maintain two different implementations. This could be done later / when needed.
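
A snapshot transformation adapter could be as small as a pure function over the summary tree. The sketch below is only an illustration of that shape: both layouts in it (an `.attributes` blob next to the channels versus `.metadata` plus a `.channels` subtree) are invented for this example and are not the actual FluidDataStoreRuntime or DataStores formats.

```typescript
// Both layouts here are INVENTED for illustration; they are not Fluid's real summary formats.
interface SnapshotTree {
  [name: string]: SnapshotTree | string;
}

// Assumed "old" layout: an ".attributes" blob at the root, channels as its siblings.
// Assumed "new" layout: the blob under ".metadata", channels under ".channels".
function toNewLayout(old: SnapshotTree): SnapshotTree {
  const channels: SnapshotTree = {};
  const result: SnapshotTree = { ".channels": channels };
  for (const name of Object.keys(old)) {
    if (name === ".attributes") {
      result[".metadata"] = old[name];
    } else {
      channels[name] = old[name];
    }
  }
  return result;
}

// Usage: the input tree is left untouched; a new tree is returned.
const oldSnapshot: SnapshotTree = { ".attributes": "v1", map: { header: "x" } };
const newSnapshot = toNewLayout(oldSnapshot);
```

Because the function does not mutate its input, it could run at load time in front of whichever implementation reads the snapshot.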

A note on implementation: some of the interactions between DataStores and ContainerRuntime become weirder (packing & unpacking of signals / ops). This is due to the fact that interface shapes are different between layers. We can't easily change IFluidDataStoreChannel & context (due to compat concerns), and thus I had to adapt DataStore to those patterns, which causes some weirdness at the ContainerRuntime layer. It's not that bad, and comments should help / provide some explanation.
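
The packing and unpacking can be pictured as address envelopes: on the way up, each layer wraps a child's payload with that child's address; on the way down, each layer strips one envelope and routes to the addressed child. The sketch below assumes a hypothetical `Envelope` shape and hypothetical `RoutingLayer` / `RecordingStore` classes; it is not the prototype's actual code.

```typescript
// Hypothetical envelope shape, used only to illustrate layered routing.
interface Envelope {
  address: string;
  contents: unknown;
}

interface Routable {
  process(contents: unknown): void;
}

// A leaf that records whatever it is asked to process.
class RecordingStore implements Routable {
  readonly received: unknown[] = [];

  constructor(private readonly submitToParent: (contents: unknown) => void) {}

  process(contents: unknown): void {
    this.received.push(contents);
  }

  submit(contents: unknown): void {
    this.submitToParent(contents);
  }
}

// One routing layer: packs on submit, unpacks on process.
class RoutingLayer implements Routable {
  private readonly children: { [address: string]: Routable } = {};

  constructor(private readonly submitToParent: (contents: unknown) => void) {}

  // The submit function handed to the child at `address`: wraps its payload.
  submitFnFor(address: string): (contents: unknown) => void {
    return (contents) => {
      const envelope: Envelope = { address, contents };
      this.submitToParent(envelope);
    };
  }

  attach(address: string, child: Routable): void {
    this.children[address] = child;
  }

  // Strips one envelope and forwards the inner payload to the addressed child.
  process(contents: unknown): void {
    const envelope = contents as Envelope;
    const child = this.children[envelope.address];
    if (child === undefined) {
      throw new Error(`No child at address: ${envelope.address}`);
    }
    child.process(envelope.contents);
  }
}

// Usage: two layers, so an op from the leaf is wrapped twice.
const wire: unknown[] = [];
const outerLayer = new RoutingLayer((c) => {
  wire.push(c);
});
const innerLayer = new RoutingLayer(outerLayer.submitFnFor("inner"));
outerLayer.attach("inner", innerLayer);
const leafStore = new RecordingStore(innerLayer.submitFnFor("leaf"));
innerLayer.attach("leaf", leafStore);

leafStore.submit({ op: "set", value: 1 });
outerLayer.process(wire[0]); // round-trip the wrapped op back down to the leaf
```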

Curious to see what reactions this draft generates! :)

vladsud, Feb 18 '24 19:02