
Fluid LLM - SharedTree Branching & Merging

Open seanimam opened this issue 1 year ago • 4 comments

Overview

Adds a new experimental proof-of-concept library that simplifies letting an LLM collaborate live with humans by making changes directly to the state of a SharedTree.

We send the current state of a SharedTree to an LLM as JSON and ask it to make changes. The LLM returns the state it was given, with any modifications, as JSON. In effect this is a divergent branch of the SharedTree. This library compares the two branches, produces a series of diffs, and can apply them, which amounts to merging the two branches.
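As a rough illustration of that round trip (not the library's actual API), the sketch below serializes a state object, hands it to an LLM, and checks the shape of what comes back. `TodoState`, `AskLlm`, `isTodoState`, and `requestBranch` are all hypothetical names, the LLM call is a synchronous stand-in for what would really be an async request, and the hand-written shape check only stands in for the Zod validation described later.

```typescript
// Hypothetical state shape, used only for illustration.
interface TodoState {
  title: string;
  todos: { id: string; text: string }[];
}

// Stand-in for a real LLM call: takes a prompt, returns the model's JSON text.
type AskLlm = (prompt: string) => string;

// Minimal structural check on the parsed response (assumption: the real
// library delegates this kind of check to a Zod schema).
function isTodoState(value: unknown): value is TodoState {
  if (typeof value !== "object" || value === null) return false;
  const v = value as { title?: unknown; todos?: unknown };
  return (
    typeof v.title === "string" &&
    Array.isArray(v.todos) &&
    v.todos.every(
      (t: unknown) =>
        typeof t === "object" &&
        t !== null &&
        typeof (t as { id?: unknown }).id === "string" &&
        typeof (t as { text?: unknown }).text === "string",
    )
  );
}

// Sends the current state as JSON and returns the LLM's "branch" of it.
function requestBranch(current: TodoState, instruction: string, askLlm: AskLlm): TodoState {
  const prompt = instruction + "\nCurrent state:\n" + JSON.stringify(current);
  const parsed: unknown = JSON.parse(askLlm(prompt));
  if (!isTodoState(parsed)) {
    throw new Error("LLM response does not match the expected schema");
  }
  return parsed;
}
```

The returned object is the divergent branch; diffing it against the original state is the next step.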

Eventually, SharedTree will support this natively through branching and merging; until then, this library is a short-term solution.

File Structure & Class Explanation

  • fluid-llm
    • |-> object-diff : Old package, a fork of microdiff with some small tweaks. You can ignore this; it will probably be removed.
    • |-> shared-tree-diff : Core package of this library.
      • |-> SharedTreeBranchManager.ts : Enforces schema between branches, produces diffs, and handles applying diffs.
      • |-> sharedTreeDiff.ts : Produces a series of diffs when comparing two SharedTree nodes or JavaScript objects.

Quick Overview

You can take a look at /test/SharedTreeBranchManager.spec.ts for examples of how everything comes together.

sharedTreeDiff.ts

This file provides sharedTreeDiff(), which produces a list of "diffs" when comparing two different SharedTrees. The core logic was originally forked from microdiff https://www.npmjs.com/package/microdiff.

How each type of diff (Create, Change, Delete) is created is straightforward, with the exception of the "move" diff. For move diffs we use the unique identifier of an object (e.g. an ID field) to provide object permanence. At this time it only applies to objects within an array, but it lets you track a node being moved to a different index. Without "moves", a node at a new index in a separate branch would be considered a Create.
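To make the move idea concrete, here is a minimal sketch of ID-based move detection over two arrays. It is not the code in sharedTreeDiff.ts: the `Item` and `ArrayDiff` shapes and the `diffArrayById` name are assumptions made up for this example, and it assumes every array element carries a string `id`.

```typescript
// Hypothetical shapes; the real sharedTreeDiff() output may differ.
type Item = { id: string; [key: string]: unknown };

type ArrayDiff =
  | { type: "CREATE"; index: number; value: Item }
  | { type: "REMOVE"; index: number; oldValue: Item }
  | { type: "MOVE"; id: string; oldIndex: number; newIndex: number };

function diffArrayById(oldItems: Item[], newItems: Item[]): ArrayDiff[] {
  const diffs: ArrayDiff[] = [];
  // Linear lookups keep the sketch short; a map keyed by id would scale better.
  const oldIds = oldItems.map((item) => item.id);
  const newIds = newItems.map((item) => item.id);

  // Anything whose id is missing from the new array was removed.
  oldItems.forEach((item, index) => {
    if (newIds.indexOf(item.id) === -1) {
      diffs.push({ type: "REMOVE", index, oldValue: item });
    }
  });

  // An id that exists on both sides at different indices is a move,
  // not a remove followed by a create.
  newItems.forEach((item, newIndex) => {
    const oldIndex = oldIds.indexOf(item.id);
    if (oldIndex === -1) {
      diffs.push({ type: "CREATE", index: newIndex, value: item });
    } else if (oldIndex !== newIndex) {
      diffs.push({ type: "MOVE", id: item.id, oldIndex, newIndex });
    }
  });

  return diffs;
}

const exampleDiffs = diffArrayById(
  [{ id: "a" }, { id: "b" }, { id: "c" }],
  [{ id: "c" }, { id: "a" }, { id: "d" }],
);
```

Note that this naive version also reports a MOVE for elements that merely shifted because a neighbor was inserted or removed; a real implementation would likely need to filter those out.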

SharedTreeBranchManager.ts

This class handles producing diffs using sharedTreeDiff() as well as actually applying said diffs. It also uses Zod to enforce the schema, ensuring that branches returned by LLMs don't contain unexpected changes. Applying diffs is not as straightforward as you might think: they must be applied in a specific order to reach the expected end state.
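One reason order matters, sketched under assumptions: if several removals target indices in the original array, applying them from lowest to highest index shifts the later targets. The example below applies removals from the highest index down, then insertions from the lowest index up. The `IndexedDiff` shape and `applyInSafeOrder` are hypothetical and are not necessarily the ordering SharedTreeBranchManager uses; here REMOVE indices refer to the original array and CREATE indices to the final array.

```typescript
// Hypothetical diff shape for a flat array of strings.
type IndexedDiff =
  | { type: "REMOVE"; index: number }
  | { type: "CREATE"; index: number; value: string };

type RemoveDiff = Extract<IndexedDiff, { type: "REMOVE" }>;
type CreateDiff = Extract<IndexedDiff, { type: "CREATE" }>;

function applyInSafeOrder(target: string[], diffs: IndexedDiff[]): string[] {
  const result = target.slice();

  // Highest index first, so earlier removals never shift later targets.
  const removes = diffs
    .filter((d): d is RemoveDiff => d.type === "REMOVE")
    .sort((a, b) => b.index - a.index);

  // Lowest index first, so each insert lands at its final position.
  const creates = diffs
    .filter((d): d is CreateDiff => d.type === "CREATE")
    .sort((a, b) => a.index - b.index);

  for (const d of removes) result.splice(d.index, 1);
  for (const d of creates) result.splice(d.index, 0, d.value);
  return result;
}

const ordered = applyInSafeOrder(
  ["a", "b", "c", "d"],
  [
    { type: "REMOVE", index: 0 },
    { type: "REMOVE", index: 2 },
    { type: "CREATE", index: 1, value: "x" },
  ],
);
```

Applying the two removals in ascending order instead would delete "a" and then "d" (the element that slid into index 2), leaving "c" behind.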

Without this library, the naive method would be to completely replace the old SharedTree with the new one. That is not performant and doesn't play well with SharedTree event listeners, which tie into the front-end re-render event loop. In contrast, this library lets you apply changes piecemeal, preserving the original tree and producing a series of changes that closely resembles edits made by a human.
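The difference can be sketched on plain JavaScript objects (not real SharedTree nodes). The `Diff` shape below is modeled on microdiff-style output with a `path` array, and `applyDiff` is a hypothetical helper, not a method of SharedTreeBranchManager. The point is that a targeted change leaves sibling nodes, and their object identity, untouched.

```typescript
type Path = (string | number)[];

// Diff shape modeled on microdiff-style output; an assumption for this sketch.
type Diff =
  | { type: "CREATE"; path: Path; value: unknown }
  | { type: "CHANGE"; path: Path; value: unknown; oldValue: unknown }
  | { type: "REMOVE"; path: Path; oldValue: unknown };

// Walks to the parent of the target and mutates only that one slot.
function applyDiff(root: any, diff: Diff): void {
  const key = diff.path[diff.path.length - 1];
  let parent = root;
  for (const segment of diff.path.slice(0, -1)) {
    parent = parent[segment];
  }
  if (diff.type === "REMOVE") {
    if (Array.isArray(parent)) parent.splice(Number(key), 1);
    else delete parent[key];
  } else {
    parent[key] = diff.value;
  }
}

const todoTree = {
  title: "List",
  todos: [
    { id: "1", text: "write tests", done: false },
    { id: "2", text: "review PR", done: false },
  ],
};
const untouchedNode = todoTree.todos[1];

applyDiff(todoTree, { type: "CHANGE", path: ["todos", 0, "done"], value: true, oldValue: false });
```

After the call, `todoTree.todos[1]` is still the same object as `untouchedNode`, whereas replacing the whole tree would hand every listener a brand-new object graph.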

Reviewer Guidance

  • Not all of the code is fully tested, particularly the SharedTreeBranchManager methods. I will wait for a few review rounds to solidify the design approach before adding the remaining tests.

seanimam avatar Aug 29 '24 17:08 seanimam

We should definitely get Taylor or someone from the Tree team to look this over too, since it is related to the tree APIs.

Josmithr avatar Aug 29 '24 18:08 Josmithr

@fluid-example/bundle-size-tests: +245 Bytes
Metric Name | Baseline Size | Compare Size | Size Diff
aqueduct.js | 460.26 KB | 460.29 KB | +35 Bytes
azureClient.js | 558.25 KB | 558.29 KB | +49 Bytes
connectionState.js | 680 Bytes | 680 Bytes | No change
containerRuntime.js | 261.02 KB | 261.04 KB | +14 Bytes
fluidFramework.js | 401.41 KB | 401.42 KB | +14 Bytes
loader.js | 134.24 KB | 134.25 KB | +14 Bytes
map.js | 42.43 KB | 42.44 KB | +7 Bytes
matrix.js | 146.58 KB | 146.58 KB | +7 Bytes
odspClient.js | 525.54 KB | 525.58 KB | +49 Bytes
odspDriver.js | 97.85 KB | 97.87 KB | +21 Bytes
odspPrefetchSnapshot.js | 42.81 KB | 42.82 KB | +14 Bytes
sharedString.js | 163.3 KB | 163.31 KB | +7 Bytes
sharedTree.js | 391.87 KB | 391.88 KB | +7 Bytes
Total Size | 3.3 MB | 3.3 MB | +245 Bytes

Baseline commit: 51ac6498c979559f94d5892fe3431554f64ebc39

Generated by :no_entry_sign: dangerJS against 9402c9378f08e97d7755f467f2f40ad1183cd235

msfluid-bot avatar Aug 29 '24 18:08 msfluid-bot

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> [email protected] ci:linkcheck /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test ci:start 1313 linkcheck:full

1: starting server using command "npm run ci:start"
and when url "[ 'http://127.0.0.1:1313' ]" is responding with HTTP status code 200
running tests using command "npm run linkcheck:full"


> [email protected] ci:start
> http-server ./public --port 1313 --silent


> [email protected] linkcheck:full
> npm run linkcheck:fast -- --external


> [email protected] linkcheck:fast
> linkcheck http://localhost:1313 --skip-file skipped-urls.txt --external

Crawling...

Stats:
  405684 links
    3154 destination URLs
       2 URLs ignored
       0 warnings
       0 errors


github-actions[bot] avatar Sep 13 '24 16:09 github-actions[bot]

Work on this space continued in https://github.com/microsoft/FluidFramework/pull/22732

alexvy86 avatar Oct 08 '24 16:10 alexvy86

@seanimam is there anything from here that we still want/need to bring in to ai-collab?

alexvy86 avatar Oct 21 '24 17:10 alexvy86

This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!