FluidFramework
Simplified model for Audience; Leverage join signal's referenceSequenceNumber for catch up logic
Most important changes:
- "self" will show up at the same time in Quorum and Audience for a "write" connection. This is done by ignoring signals for such connections and duplicating Quorum changes into Audience. This gives clients a simpler picture to work with.
- I'd think it makes sense to merge these two data structures in the future, and instead of having 2 of them, provide various adapters / filters that keep usage simple for scenarios where someone needs only "write" clients (as an example).
- The "connected" transition will happen after "self" shows up in Audience. This significantly simplifies the programming model for users.
- In case of failure (currently defined as 5 seconds of not receiving the join signal), we force a reconnect. It's hard to say how often this happens, or whether such recovery is too expensive or has a good success rate. If the tracked performance of the join op is any indicator, such recovery will likely not be very successful, but tracking it in telemetry will help us find real bugs in the relay service.
- This is the part that I'm least sure about. Planning to test it in ODSP scalability runs. Kill-bit feature gate has been added.
- referenceSequenceNumber from the join signal (if provided) is used to determine whether the client needs to fetch ops from storage to catch up, and as a data point for when to raise the "connected" event. The latter requires enabling the existing Fluid.Container.CatchUpBeforeDeclaringConnected gate.
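
To make the intended ordering concrete, here is a minimal sketch of the decision logic described above. It is not the code in this PR: `JoinSignal`, `ConnectionState`, and every function name are hypothetical, and the only values taken from the description are the 5-second timeout and the idea of gating the catch-up check.

```typescript
// Hypothetical shapes for illustration only; not the Fluid Framework API.
interface JoinSignal {
    clientId: string;
    // Optional: the description says it is used "if provided".
    referenceSequenceNumber?: number;
}

interface ConnectionState {
    selfClientId: string;
    lastProcessedSequenceNumber: number;
    selfInAudience: boolean;
    // Sequence number we must reach before we count as caught up.
    catchUpTarget?: number;
}

// Failure is "currently defined as 5 seconds of not receiving join signal".
const JOIN_SIGNAL_TIMEOUT_MS = 5000;

// True when there is no known target or we have already processed up to it.
function isCaughtUp(state: ConnectionState): boolean {
    return (
        state.catchUpTarget === undefined ||
        state.lastProcessedSequenceNumber >= state.catchUpTarget
    );
}

// Records our own join signal. Returns true when ops must be fetched
// from storage to catch up. Join signals for other clients are ignored here.
function onJoinSignal(state: ConnectionState, signal: JoinSignal): boolean {
    if (signal.clientId !== state.selfClientId) {
        return false;
    }
    state.selfInAudience = true;
    state.catchUpTarget = signal.referenceSequenceNumber;
    return !isCaughtUp(state);
}

// "connected" is raised only after "self" is in Audience. The catch-up check
// applies only when the (assumed boolean) gate value is on.
function canDeclareConnected(
    state: ConnectionState,
    catchUpBeforeConnected: boolean,
): boolean {
    if (!state.selfInAudience) {
        return false;
    }
    return !catchUpBeforeConnected || isCaughtUp(state);
}

// Force a reconnect if our own join signal has not arrived in time.
function shouldForceReconnect(state: ConnectionState, elapsedMs: number): boolean {
    return !state.selfInAudience && elapsedMs >= JOIN_SIGNAL_TIMEOUT_MS;
}
```

Keeping these as pure functions over a small state object is only a choice made for this sketch; it makes the ordering guarantees easy to check in isolation.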
Less important:
- Simplified Protocol creation and initializing / resetting audience on connection by leveraging signal queue for this sequence.
- ISignalClient was lying about its shape - extra properties exist only on ISignalMessage, even though they are applicable only for payloads of ISignalClient shape (i.e., join & leave signals).
- I believe it would actually be more correct to have that payload on ISignalClient, as it would ensure clients get the same info on connection for clients already in the audience. But for now, I'm reflecting reality.
- Added SignalType to separate it from MessageType. Opened ADO#1986 to track moving existing code to using it.
- Simplified CatchUpMonitor a bit
- Removed some number of unneeded type casts.
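
As an illustration of the ISignalClient / ISignalMessage point above, the sketch below dispatches on a dedicated signal-type value and reads referenceSequenceNumber from the message envelope rather than from the client payload. Every name here is hypothetical (prefixed `Sketch`), the `"join"` / `"leave"` string values are an assumption, and so is the leave payload being a bare client id. The real definitions are in the PR itself.

```typescript
// Hypothetical stand-in for a signal-type set kept separate from op message types.
const SketchSignalType = {
    ClientJoin: "join",
    ClientLeave: "leave",
} as const;

// Payload of a join signal: describes the client only.
interface SketchSignalClient {
    clientId: string;
}

// Envelope: the extra property lives here, not on the payload,
// which is the "reality" the description says the types now reflect.
interface SketchSignalMessage {
    type: string;
    content: unknown;
    referenceSequenceNumber?: number;
}

type AudienceChange =
    | { kind: "add"; clientId: string; referenceSequenceNumber?: number }
    | { kind: "remove"; clientId: string }
    | { kind: "none" };

// Maps a system signal to the Audience change it implies.
function classifySignal(msg: SketchSignalMessage): AudienceChange {
    switch (msg.type) {
        case SketchSignalType.ClientJoin: {
            const client = msg.content as SketchSignalClient;
            return {
                kind: "add",
                clientId: client.clientId,
                referenceSequenceNumber: msg.referenceSequenceNumber,
            };
        }
        case SketchSignalType.ClientLeave:
            // Assumption: the leave payload is just the departing client id.
            return { kind: "remove", clientId: msg.content as string };
        default:
            // Any other signal is not an Audience change.
            return { kind: "none" };
    }
}
```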
Back-compat considerations:
- There should be no back-compat concerns here, as the behavior (where changed) was never defined. I.e., the order in which clients show up in Audience and Quorum is undefined.
- Similarly, the order of the "connected" transition and the presence of "self" in Audience is undefined.