aquadoggo
Dispatch `reduce` tasks for all unmaterialized entries during start up
Our materializer has two sorts of "events" which are important to re-attempt when a node quits prematurely, to make sure we do not lose data:
- Re-attempt tasks
- Re-attempt unmaterialized operations
They seem related but are actually independent of each other: tasks do not necessarily represent arriving operations. Say an operation arrives for the first time and kicks off a `reduce` task, followed by a `dependency` task. Now the node is shut off before that `dependency` task finishes. If we send that operation again on restart to re-attempt the flow, the `reduce` task will quit early, reporting that it already did its work last time. No `dependency` task will be dispatched, and we have lost data.
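To make the failure mode above concrete, here is a minimal, heavily simplified sketch. It is not aquadoggo's actual code: `Node`, `on_operation` and `shut_down` are hypothetical names, and the in-memory queue stands in for a node that does not re-attempt tasks after a restart.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified node (not the aquadoggo implementation).
struct Node {
    /// Persisted state: operations for which `reduce` already did its work.
    reduced: HashSet<String>,
    /// In-memory follow-up tasks, assumed here to be lost on shutdown.
    queue: Vec<String>,
}

impl Node {
    fn new() -> Self {
        Node {
            reduced: HashSet::new(),
            queue: Vec::new(),
        }
    }

    /// An operation arrives and triggers a `reduce` task.
    fn on_operation(&mut self, operation_id: &str) {
        // `insert` returns false if the id was already present, which models
        // `reduce` quitting early because it did its work last time.
        if !self.reduced.insert(operation_id.to_string()) {
            return;
        }
        // Only a fresh `reduce` run dispatches the follow-up `dependency` task.
        self.queue.push(format!("dependency:{}", operation_id));
    }

    /// The node quits before the queued tasks have finished.
    fn shut_down(&mut self) {
        self.queue.clear();
    }
}
```

In this model, calling `on_operation("op_1")`, then `shut_down()`, then `on_operation("op_1")` again leaves the queue empty: re-sending the operation alone does not bring the `dependency` task back.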
The reverse is also true: tasks are handled too late in some race conditions, where operations were stored successfully but the node quit before the `reduce` task was created. We have lost data again.
The first point (tasks) is already solved, but we also need to account for unmaterialized operations. This was not possible until now, since it was not easy to distinguish in our database whether an operation had been materialized or not. Now we have a `sorted_index` which represents that state, see: https://github.com/p2panda/aquadoggo/pull/438
On node startup we should check which operations have `sorted_index = None` and then issue `reduce` tasks for them.
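A rough sketch of what that startup check could look like, under stated assumptions: `StoredOperation`, `Task` and `pending_reduce_tasks` are hypothetical names, not the existing aquadoggo types, and the operations are passed in as a slice rather than queried from the database. It also assumes one `reduce` task per affected document is enough, which may not match the final design.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a stored operation row.
#[derive(Debug, Clone)]
struct StoredOperation {
    /// Identifier of the document this operation belongs to.
    document_id: String,
    /// `None` means the operation has not been materialized yet.
    sorted_index: Option<u32>,
}

/// Hypothetical task description: a worker name plus the document to process.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Task {
    worker: &'static str,
    document_id: String,
}

/// Collect one `reduce` task per document which still has unmaterialized
/// operations (assumption: tasks are deduplicated per document).
fn pending_reduce_tasks(operations: &[StoredOperation]) -> Vec<Task> {
    // A `BTreeSet` removes duplicates and yields a deterministic order.
    let documents: BTreeSet<&str> = operations
        .iter()
        .filter(|operation| operation.sorted_index.is_none())
        .map(|operation| operation.document_id.as_str())
        .collect();

    documents
        .into_iter()
        .map(|document_id| Task {
            worker: "reduce",
            document_id: document_id.to_string(),
        })
        .collect()
}
```

On startup the node would run this once over the stored operations and push the resulting tasks into the task queue before accepting new operations.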