FluidFramework icon indicating copy to clipboard operation
FluidFramework copied to clipboard

Remove gratuitous JSON.stringify from parseHandles utility

Open markfields opened this issue 1 year ago • 8 comments

Description

We added the encode and decode functions onto IFluidSerializer a few years back, and updated the utility makeHandlesSerializable to use encode, but left the inverse utility parseHandles using the old (less efficient) way of doing serializer.parse(JSON.stringify(value)).

This PR switches to using serializer.encode which does an object walk itself, and avoids translating through string.

Reviewer Guidance

I'm assuming there is adequate test coverage of this code path. I also haven't run any benchmarks - @DLehenbauer any risk that somehow this is slower than the roundtrip through string?

Merge-Tree fuzz test failures

Merge-Tree fuzz tests seeds 29 and 47 are failing with assert 0x04f - "minSeq of collab window > target minSeq!". This is surprising but they both failed for two runs in a row, so something is going on here. I'm not able to look into it for now so converting to a draft PR.

markfields avatar Feb 20 '24 20:02 markfields

@fluid-example/bundle-size-tests: -140 Bytes
Metric NameBaseline SizeCompare SizeSize Diff
aqueduct.js 514.74 KB 514.71 KB -28 Bytes
connectionState.js 680 Bytes 680 Bytes No change
containerRuntime.js 249.17 KB 249.17 KB No change
loader.js 128.23 KB 128.23 KB No change
map.js 41.35 KB 41.32 KB -28 Bytes
matrix.js 143.61 KB 143.58 KB -28 Bytes
odspDriver.js 97.49 KB 97.49 KB No change
odspPrefetchSnapshot.js 41.91 KB 41.91 KB No change
sharedString.js 161.38 KB 161.35 KB -28 Bytes
sharedTree.js 330.21 KB 330.19 KB -28 Bytes
Total Size 1.81 MB 1.81 MB -140 Bytes

Baseline commit: bdca465afcdfb17143f9032dbee8da4825012003

Generated by :no_entry_sign: dangerJS against 0e98a95b169870d376f09972c07723aca3ca00a6

msfluid-bot avatar Mar 15 '24 22:03 msfluid-bot

@connorskees - Funny - this doesn't fail those fuzz tests anymore...

@DLehenbauer, @Abe27342 - Any thoughts on whether this is a safe change to make?

markfields avatar Mar 15 '24 22:03 markfields

@connorskees - Funny - this doesn't fail those fuzz tests anymore...

@DLehenbauer, @Abe27342 - Any thoughts on whether this is a safe change to make?

I'd want to understand the source of failure first. Do you remember exactly which tests were failing? i.e. were they the ones in sequence? or some of the test farms in merge-tree?

We've recently changed the scenarios fuzz tests exercise to support the stashed op flow, so ones built on top of the DDS fuzz harness won't be exercising the same scenarios they did 1 month ago, but the issue could still be prevalent (especially if it was only 2 seeds)

Abe27342 avatar Mar 18 '24 16:03 Abe27342

https://github.com/microsoft/FluidFramework/pull/19242 may have had an effect in making those tests no longer fail

connorskees avatar Mar 18 '24 16:03 connorskees

@Abe27342 can you link the commit(s) that changed the fuzz harness? I'd like to try the combo of Connor's PR + old fuzz harness and see which tweaked it.

markfields avatar Apr 22 '24 21:04 markfields

@vladsud - This is one gratuitous string roundtrip that I was talking about this morning.

markfields avatar Apr 22 '24 21:04 markfields

@markfields #20508 and various follow-ups to use random.handle() in different packages (these are made by Jillian, check on github), #20358, #19802, #19701, etc.... It'll be hard to narrow down

Abe27342 avatar Apr 22 '24 21:04 Abe27342

Is there some write up that tells us where we were, where we are today, and what else we can do to make things better? My main concern - when I did profile matrix 2 months back (at large scale), almost all the time in profiler was handle parsing. Should it be better now? I'll re-meassure, but having some English narrative somewhere with status would be very useful (including assessment of possible next steps and SWAGs)

vladsud avatar Apr 23 '24 16:04 vladsud

Is there some write up that tells us where we were, where we are today, and what else we can do to make things better? My main concern - when I did profile matrix 2 months back (at large scale), almost all the time in profiler was handle parsing. Should it be better now? I'll re-meassure, but having some English narrative somewhere with status would be very useful (including assessment of possible next steps and SWAGs)

I'll write down e2e op submit/processing perf benchmarking / optimization for next semester planning.

markfields avatar May 16 '24 13:05 markfields

Closing this in favor of #21102. At this time, not planning to dig into the fuzz test failures. But open to feedback that we should prioritize that investigation. It's hard for me to imagine where we would have a correctness dependency on the LHS here - we've been using the serializer.encode in production for a long time which does the same object walk. And @jzaffiro added serialier.decode calls to the DDS fuzz tests last month, and we aren't seeing problems since then.

markfields avatar May 16 '24 14:05 markfields