Fusion
Transactions / batching
Sometimes you need to update multiple state objects at once, for example setting multiple value objects to new values. Because Fusion enforces an internally consistent reactive graph at all times, this means that the 'intermediate' state (where not all objects have been changed yet) has to be computed.
To solve this, it would be useful to introduce a mechanism for deferring updates until all of the objects are changed.
Updates can't be deferred across a yield; we must apply all changes before handing control to code elsewhere in order to maintain our consistency guarantee.
There are two immediately obvious ways to do this. One is to provide start and end functions, which respectively disable and re-enable graph updates (of course, updating the graph if needed). This is simple but also error-prone - what if one of those calls is missing?
In my mind it'd be better to use a callback based API, where updates are disabled, then the callback is run, then updates are re-enabled. It's impossible to mess a callback system up as you would with the separated functions.
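As a minimal, self-contained sketch of the callback shape (all names here are hypothetical, not real Fusion API), the key property is that the flush lives right next to the callback invocation, so updates can't stay disabled by accident:

```lua
-- Hypothetical sketch of the callback-based API; none of these names are
-- real Fusion functions. Updates queued during the callback are flushed
-- exactly once, even if the callback errors, so the "missing end call"
-- failure mode of a start/end API can't happen.
local inTransaction = false
local pendingUpdates = {}

local function scheduleUpdate(update)
	if inTransaction then
		table.insert(pendingUpdates, update)
	else
		update()
	end
end

local function Transaction(callback)
	inTransaction = true
	local ok, err = pcall(callback)
	inTransaction = false
	for _, update in ipairs(pendingUpdates) do
		update()
	end
	table.clear(pendingUpdates)
	if not ok then
		error(err, 0)
	end
end
```

Because re-enabling and flushing happen unconditionally after the pcall, there is no way to forget them, unlike with paired start/end functions.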
One concern is how to deal with nested transactions; done naively, these could be error prone. Nested transactions should be safe to treat like no-ops, just invoking the function without any further processing.
Another concern is how to deal with yielding in the callback. This is important for consistency, and so must be dealt with. We could automatically flush updates on yield (essentially splitting the transaction into smaller transactions based on where it yields). It might alternatively be more clear and honest to error on yield.
Perhaps it would be useful to give the user a callback to flush updates too? What's the use case for this? Perhaps this would be most useful for code dealing with transactions more dynamically? Should nested transactions flush instead?
These are all interesting points worth discussing.
It's interesting to think about how this could lead to errors. What if the user calls a function in the transaction which doesn't expect to be called from a transaction? Can we help the user avoid that issue, or is this an unavoidable problem of this system?
Perhaps another solution is to explicitly pass objects to Fusion for them to be handled safely?
local a, b = Value(2), Value("foo")
batchSet({a, b}, {5, "shoe"})
This avoids the 'calling function from transaction' issue, but how does this interface with third party state objects?
Furthermore, this is a break from the regular way people set values on their objects - is that worth it?
This is only necessary as an alternative to #144. If that's added, it implicitly avoids the intermediate calculation if not required. It also easily solves the second comment regarding errors within the transaction.
It's always good to explore the problem space more! Plus, this doesn't actually solve the problem if there's an Observer watching for changes.
There are other uses for batch setting, for example if a Computed does an expensive calculation on two Values, you would want to be able to set them both at the same time to only run the calculation once. Currently, you would set one Value, it would run the calculation, then you would set the other Value, and it would run the calculation again. Assuming the Computed is being listened to with an Observer or as a property, lazy states wouldn't matter.
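The double recalculation described above would look something like this (Fusion-style sketch; it assumes updates are eager because an Observer is attached):

```lua
local width = Value(10)
local height = Value(20)
local area = Computed(function(use)
	-- imagine this multiplication is an expensive calculation
	return use(width) * use(height)
end)
Observer(area):onChange(function() end)

width:set(30)  -- the calculation runs here with the stale height...
height:set(40) -- ...and again here, though only the final result is wanted
```

A batch set of `width` and `height` together would collapse those two recalculations into one.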
What if the user calls a function in the transaction which doesn't expect to be called from a transaction?
An idea to negate this is to have a separate function for setting, specifically for use inside transactions. The regular Value:set would retain its current functionality — it would instantaneously set the Value. This would allow functions that are not designed to be called inside transactions to run as intended. The separate function (I will call it Value:transactionSet here for simplicity) would cache the value you pass in and only set the Value after the callback ends. If a Value is transactionSet multiple times, it would simply overwrite the cache. Essentially, after the callback returns, the last parameter transactionSet was called with for each Value is what that Value gets set to.
Functions intended to be only called inside transactions could use the transactionSet method, which would error when called outside of a transaction.
A function called inside a transaction may itself include a transaction, so nested transactions must be supported. The way I see this is that the nested transaction runs like normal, setting the Values at the end, as would happen when not inside a transaction. Importantly, it should only set the Values that were transactionSet inside the nested transaction, and shouldn't set the Values that were transactionSet before. Every transaction, regardless of nesting, should act as its own thing. A way to do this is to have every transaction have its own cache.
With this model, yielding isn't an issue, as the Values that are transactionSet aren't locked or anything — they simply are set to the transaction's cached value at the end of the callback. Multiple transaction callbacks running simultaneously should not interfere with each other.
As for implementation, you could add a private bulk set method, and when the callback ends you call it with the cached parameters for that transaction.
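As a rough sketch, the per-transaction cache could look like this (all names are hypothetical; real Fusion internals would differ):

```lua
-- Each Transaction pushes its own cache onto a stack, so nested
-- transactions are isolated from their parents for free.
local transactionStack = {}

local function transactionSet(valueObject, newValue)
	local cache = transactionStack[#transactionStack]
	assert(cache ~= nil, "transactionSet called outside of a transaction")
	-- calling transactionSet twice for the same Value just overwrites the entry
	cache[valueObject] = newValue
end

local function Transaction(callback)
	table.insert(transactionStack, {})
	callback()
	local cache = table.remove(transactionStack)
	-- the private bulk set: apply every cached value in one go
	for valueObject, newValue in pairs(cache) do
		valueObject:set(newValue)
	end
end
```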
Some code examples to better explain it:
local a = Value("foo")
local b = Value("bar")
Transaction(function()
a:transactionSet("bar")
b:transactionSet("foo")
-- The Values are not actually set until the callback ends
print(peek(a), peek(b)) --> foo bar
b:transactionSet("baz")
-- The :set call works inside of the transaction, but will be replaced when the callback ends
a:set("foobar")
print(peek(a)) --> foobar
end)
print(peek(a), peek(b)) --> bar baz
local a = Value("foo")
local b = Value("bar")
-- The transaction call will yield due to the task.wait
Transaction(function()
a:transactionSet("bar")
b:transactionSet("foo")
-- A Transaction acts as a :set, not a :transactionSet
-- It will temporarily set the Value but will be overwritten after the callback ends
Transaction(function()
a:transactionSet("baz")
end)
-- b was only :transactionSet inside the outer transaction
-- It will only update when the outer transaction ends
print(peek(a), peek(b)) --> baz bar
-- Yielding doesn't affect anything
task.wait(1)
print(peek(a), peek(b)) --> baz bar
end)
print(peek(a), peek(b)) --> bar foo
Assuming the Computed is being listened to with an Observer or as a property, lazy states wouldn't matter.
This would depend on the implementation taken for #144. It could either be "recalculate immediately if any dependents exist", or "recalculate only when the value is requested". My original comment only took the latter option into account, apologies for that. With property assignments currently being deferred, the recalculation would only happen at the end of a frame, when all applicable changes would have had a chance to be set in their respective Values. However, I've just seen the Discord message stating that deferred assignment will be removed, so that theory no longer matters anyway.
I don't think transaction yielding is a good thing to support. The two code examples provided by funwolf7 conflict with the "linearity" of execution, which makes it hard to understand at a glance.
After thinking about this some more, I think that I prefer the batching idea. It falls more in line with what I would personally use it for; I mainly just want to be able to set 2+ Values at once, so that Computeds relying on both only run once. A case could be made where transactions are easier when you want to be able to procedurally choose which Values to bulk update, but for me that seems like a rarer case, and wouldn't be too difficult with a batched system. Batching would be less error-prone and there would be little ambiguity, which is why I prefer it.
Perhaps another solution is to explicitly pass objects to Fusion for them to be handled safely?
local a, b = Value(2), Value("foo")
batchSet({a, b}, {5, "shoe"})
That being said, I think that this syntax is horrible. It would be extremely difficult to look at it and figure out which property goes with which Value object, especially with many Values, and would be impossible to typecheck well.
A theoretical way would be to use a dictionary, where the key is the Value object, and the value is what to set it to.
local a, b = Value(2), Value("foo")
batchSet{
[a] = 5,
[b] = "shoe",
}
This solves the problem of it being difficult to see which Value and property go together, and would be easy to use procedurally, but unfortunately, I don't think it would be possible to typecheck, which is the big killer for me.
A solution that can be typechecked is to have a typechecked function (for demonstration, I will call it Value:batch(), but the name is irrelevant) that generates some readonly information containing the Value and what to set it to, and then you can batch set an array of that information.
local a, b = Value(2), Value("foo")
batchSet({a:batch(5), b:batch("shoe")})
It is easy to tell which value gets set to which Value object, and it parallels the syntax for setting Values.
To allow for easier usage of typechecked procedural calls, the type that Value:batch() returns could be public. This would allow you to do something like so:
local valList: {Fusion.Value<number>} = {Value(0), Value(0), Value(0), Value(0), Value(0)}
local batchList: {Fusion.BatchInfo} = {}
for _, val: Fusion.Value<number> in ipairs(valList) do
table.insert(batchList, val:batch(math.random()))
end
batchSet(batchList)
There are some small questions that would need answering for this. For example, what happens when the same Value is used twice in the same batch set? These aren't that major, however, and wouldn't pose too big of a challenge to answer.
What about
batchSet {
{ aValue, 2 },
{ bValue, "foo" },
}
This gets the benefits of option 2. Does that work with typechecking? (I've checked that this would work for roblox-ts. The dictionary option 2 unfortunately would not)
batchSet {
{ aValue, 2 },
{ bValue, "foo" },
}
This seems impossible to typecheck as well.
Using an array for combining the Value and property into one would be difficult to typecheck. Luau doesn't support number literal types, so there wouldn't be a way to ensure that the first index is a Value and the second index is something that index 1 can be set to. You could use string keys instead, which would work but would be tedious to write many times.
The more important problem is typechecking the outer array. Your only options would be to use an any type, which would make it not give a type error when setting the Value to the incorrect type, or to use a type generic for the function, which would only allow batch sets with Values that are the same type, which would be very restricting.
This is the reason I proposed a function for generating the combined info. By using a function, you can typecheck the creation of the info, rather than having to typecheck it in the function that uses the info, which would be impossible.
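To illustrate, the generating-function approach could be typed along these lines (a hypothetical Luau sketch; `BatchInfo` and `batch` are invented names, not real Fusion API):

```lua
-- The generic is resolved per call to `batch`, so each pairing of a
-- Value and a new value is typechecked individually; `batchSet` only
-- ever sees the opaque, already-checked BatchInfo objects.
export type BatchInfo = {
	apply: () -> (),
}

local function batch<T>(valueObject: { set: (any, T) -> () }, newValue: T): BatchInfo
	return {
		apply = function()
			valueObject:set(newValue)
		end,
	}
end

local function batchSet(infos: { BatchInfo })
	for _, info in ipairs(infos) do
		info.apply()
	end
end
```

Because the type error surfaces at the `batch` call site, the outer array needs no generics at all.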
The problem I have with batchSet is that it feels far too specific to be a general function. The problem is generally that spurious or duplicate updates are being allowed to propagate. Attempting to control how a single kind of state object dispatches updates sounds like we're solving a symptom of the problem.
Let me make concrete a counterexample that highlights where this approach falls short. Consider some kind of out prop from an instance - perhaps observing the ancestry of an instance. We have a process that unparents the instance temporarily, takes a brief moment to do some work, then reparents it in a new location. We do not control the change of value here, because the data source is opaquely being managed by Fusion itself. Batching at the site of the direct mutation is not a good idea here, because if we wanted to batch the parent changes together to avoid propagating the temporary nil parent, we would have to reach into internal implementations.
I remain convinced that callbacks provide the best and most general semantics for this specific issue. It still is interesting to me whether some kind of sample-and-hold state object, or similar, could model batching directly as part of the reactive graph; I have not thought too deeply about that aspect.
I guess on that note, related to #52
Thinking deeper about this, assuming the existence of lazy evaluation, the only function of batching is controlling the number of eager path evaluations. This basically means only routes with Observer or Eager objects would be affected. As pointed out previously, this diminishes but does not eliminate the usefulness of batching. So I don't see this as ultimately too important to warrant fundamental or global changes.
Freezing/sample-and-hold makes no fundamental change to Fusion's reactive graph model. It introduces no new external considerations and is completely explicit and localised. It does not suffer the same problems as batching, and the admittedly slight bulk of creating a state object to control the freezing gets amortised quickly in situations where multiple objects need to be frozen. I think I might ultimately prefer this.
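For concreteness, a sample-and-hold object might be shaped something like this (a purely speculative sketch with invented names, not a real Fusion API):

```lua
-- While frozen, downstream code keeps seeing the held value; on thaw,
-- the latest input value is sampled once (in a real implementation, an
-- update would then propagate through the reactive graph).
local function SampleAndHold(getInput)
	local held = getInput()
	local frozen = false
	local object = {}

	function object:setFrozen(shouldFreeze)
		frozen = shouldFreeze
		if not frozen then
			held = getInput() -- re-sample when thawing
		end
	end

	function object:get()
		if frozen then
			return held
		end
		return getInput()
	end

	return object
end
```

The freezing is explicit and local to the one object being held, which is what distinguishes it from a global batching mechanism.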
Thinking about my past situations where I thought I needed batch setting, I probably could have used a Value that stores a table of information and Computeds to get individual values from it. I think that is what I ended up doing and it worked fine.
That being said, would transactions even work for the example scenario you gave?
We have a process that unparents the instance temporarily, takes a brief moment to do some work, then reparents it in a new location.
With deferred events, the event would likely happen sometime after the callback finishes, meaning that when the Out updates, it will be after things are no longer frozen. In fact, it is likely unnecessary with deferred events, as both events would fire after the Parent is already set to the new Parent.
I am starting to think that maybe this whole idea isn't very useful. Freezing would probably be enough for most purposes, and it would be unambiguous as to what is frozen and what isn't.
We can't assume we are operating in a deferred events environment as a library intended for general use across places. We should instead look to ensure behaviour can be made consistent across all uses.
I would agree that transactions seem rather unhelpful in addressing these problems generally. I'll probably close this issue for now, just to signal that I prefer the idea of modelling state freezing as part of the reactive graph. I'm totally open to reopening if anyone comes up with any good points that we might have missed.