foundationdb
foundationdb copied to clipboard
`loop-choose-when` can sometimes cause priority inversion
In certain cases, a loop-choose-when block can introduce priority inversion.
For example in the following case:
loop choose {
when(T value1 = wait(FutureStream<T>f1)) {
wait(someLock->take());
// do awesome stuff
}
when(T value2 = wait(FutureStream<T>f2)) {
// do even more awesome stuff
}
}
Let's say the f1
's task priority is lower than f2
.
There could be an execution path which causes priority inversion. Consider the following execution:
- At time t_0, because none of
f1
orf2
is ready yet, the program simply registers a callback for each of them and yield the control flow; - At time t_1,
f1
is ready and thus callback for it got fired. The program first removes all other callbacks for this actor then try to grab the lock. - If the lock is not ready, the program registers a callback for the lock and then the control flow will be yield from there. Notice as of now, the only callback registered is for the lock.
- Say after that, the
f2
is ready on the run loop before the nextf1
. But since we don't have a callback here, the only way to know that would be examining if the future is ready by callingf1.isReady()
. - Now if the program resumes the execution when the lock is ready, it will continue executing the
do awesome stuff
. After that the program returns to the head of the loop. Iff1
is ready during the wait on the lock, even it came after thef2
, the program will simply execute its block, and thus repeats from step 2 - 6.
The aforementioned bug actually boils down to:
- A
when
statement with higher delaying order waiting for a low priority run loop task - AND that
when
block contains another wait.
So the users of Flow need to pay special attention when they decided to use a wait
inside the non-tail when
block in a loop-choose-when
.
I didn't see documentations/issues/questions regarding to this and thus want to open this tracking issue for both visibility and possible future solution.
Maybe we can at least emit a warning (console warning message or an actual complier warning or both) from ActorCompiler if it sees this pattern.
What is the definition of priority
in the above example?
Does it mean that wait
in the when
condition should have higher priority than the wait
inside the when
statement?
If we wrap the first when into an actor and use an ActorCollector
to collect the actor, will the problem be solved?
What is the definition of priority in the above example? Does it mean that
wait
in thewhen
condition should have higher priority than thewait
inside the when statement?
By priority I mean the Task Priority
on the run loop. In the scenario described above, the task with a higher priority on the run loop becomes ready before the lower priority task. But actors waiting on it won't be notified. And hence the priority inversion.
Notice this behavior does not break the contract of the choose-when
structure, which is the reason why it's not a bug, but rather just an anti pattern.
If we wrap the first when into an actor and use an
ActorCollector
to collect the actor, will the problem be solved?
What is an ActorCollector
?
@xis19 added a linter to detect this code pattern in https://github.com/apple/foundationdb/pull/9800
I did an audit of the results. There are a few cases of this exact problems. The majority are delay(0)
or wait
before a break
or return
, which are fine.