FluidFramework icon indicating copy to clipboard operation
FluidFramework copied to clipboard

Fix eventual consistency issues in SharedDirectory

Open jatgarg opened this issue 3 years ago • 3 comments

Description

Thanks @anthony-murphy for the draft PR you raised earlier.

1.) Add a list of clientids which have created this directory. Add a directory seq number which is set to sequence number of create op of that directory or 0 in case it is loaded frpm snapshot. Add a shouldProcessMessage which only process ops if the remote message is either from the creator of directory or this directory was created when container was detached or in case this directory is already live(known to other clients) and the op was created after the directory was created.

2.) If a delete sub directory op comes and there is already local pending create/delete subdirectory, then instead of ignoring the delete op, clear the non-pending keys. Also reset the creator clientids and seq number to 0 as the new create op will reset them and we will also not process any set/clear/delete ops before the new create op.

3.) If there is pending delete op for this subDirectory, then don't apply the storage ops as we are going to delete this subDirectory. The problem it solves is that if client again created the sub directory after deleting, then we will end up applying that op without this change.

jatgarg avatar Aug 22 '22 23:08 jatgarg

@fluid-example/bundle-size-tests: +2.51 KB
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 397.3 KB 399.81 KB +2.51 KB
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 196.25 KB 196.25 KB No change
loader.js 153.76 KB 153.76 KB No change
map.js 42.84 KB 42.84 KB No change
matrix.js 131.78 KB 131.78 KB No change
odspDriver.js 151.87 KB 151.87 KB No change
odspPrefetchSnapshot.js 39.77 KB 39.77 KB No change
sharedString.js 152.99 KB 152.99 KB No change
Total Size 1.27 MB 1.27 MB +2.51 KB

Baseline commit: 44bb24d6b441ae5bd52c71a481e4c1cb75df5ff3

Generated by :no_entry_sign: dangerJS against 9ecd0f00568d9b2cad612474082b212665dcd466

msfluid-bot avatar Aug 23 '22 00:08 msfluid-bot

@anthony-murphy Take a look when you get time.

jatgarg avatar Aug 26 '22 17:08 jatgarg

@vladsud Take a look at this when you get time.

jatgarg avatar Sep 12 '22 20:09 jatgarg

@vladsud Take another look when you get time.

jatgarg avatar Oct 06 '22 19:10 jatgarg

I'm not comfortable approving this review without the addition of non-deterministic stress tests. These have proven to be essential at finding complex interactions between clients where eventual consistency doesn't hold. we should check in test like this independently. ideally, they would be failing so skipped, as we know there are problems, and a then be enabled in this review showing they now succeed.

This is probably not a small work item, and i'm happy to help write them.

@Abe27342 can you point to any documentation that exist for building new tests based on your framework?

anthony-murphy avatar Oct 14 '22 18:10 anthony-murphy

I'm not comfortable approving this review without the addition of non-deterministic stress tests. These have proven to be essential at finding complex interactions between clients where eventual consistency doesn't hold. we should check in test like this independently. ideally, they would be failing so skipped, as we know there are problems, and a then be enabled in this review showing they now succeed.

This is probably not a small work item, and i'm happy to help write them.

@Abe27342 can you point to any documentation that exist for building new tests based on your framework?

Sure, the package readme here is a pretty good starting point. There are some existing examples in @fluid-experimental/tree and @fluidframework/sequence (SharedTreeFuzz.ts and intervalCollection.fuzz.spec.ts are the file names IIRC). Both of those use stochastic-test-utils with a full set of fluid mocks, but if SharedDirectory supports reasonable collaborative scenarios without such mocks (e.g. like merge-tree) you could get away with less. Happy to answer any other questions you have.

Abe27342 avatar Oct 14 '22 19:10 Abe27342

I'm not comfortable approving this review without the addition of non-deterministic stress tests. These have proven to be essential at finding complex interactions between clients where eventual consistency doesn't hold. we should check in test like this independently. ideally, they would be failing so skipped, as we know there are problems, and a then be enabled in this review showing they now succeed. This is probably not a small work item, and i'm happy to help write them. @Abe27342

        Abram Sanderson (HE/HIM)
        FTE can you point to any documentation that exist for building new tests based on your framework?

Sure, the package readme here is a pretty good starting point. There are some existing examples in @fluid-experimental/tree and @fluidframework/sequence (SharedTreeFuzz.ts and intervalCollection.fuzz.spec.ts are the file names IIRC). Both of those use stochastic-test-utils with a full set of fluid mocks, but if SharedDirectory supports reasonable collaborative scenarios without such mocks (e.g. like merge-tree) you could get away with less. Happy to answer any other questions you have.

Thanks, I will take a look into these and try to build these first and then merge/fix this PR.

jatgarg avatar Oct 14 '22 19:10 jatgarg

This PR has been automatically marked as stale because it has had no activity for 60 days. It will be closed if no further activity occurs within 8 days of this comment. Thank you for your contributions to Fluid Framework!

ghost avatar Dec 14 '22 00:12 ghost