frostdb
[DO NOT MERGE] Multi-table transaction support
Took a stab at #721
There are a few rough edges I still need to address around locking on aborts, and whether we need to block writes during aborts. But it's a start for a reference implementation.
How are you avoiding reading incomplete transactions? I'm also curious as to what happens in case of a crash before a transaction fully commits.
In case it's helpful, here is what I remember about CockroachDB's transaction coordination:
Each write that forms part of a single transaction is added to the LSM in a "pending" state, with a pointer to a transaction intent (i.e. an intent to commit) that a central transaction coordinator manages. On reads, any part that points to one of these intents is returned only if the intent is in the "committed" state (an atomic flip that happens once commit() is called). This gives all of the transaction's writes a single linearization point at which they become available for reads. On transaction abort, the intent is set to the "aborted" state, and the participating writes will never have been returned for reads. Intent cleanup then depends on the outcome: on commit, remove the intent pointers from the parts (a part with no pointer is implicitly committed); on abort, simply remove the parts (or even leave them in place, since any reads, e.g. for compaction, should skip over uncommitted writes).
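To make the intent idea concrete, here is a minimal sketch of how I picture it. The `Intent`, `Part`, `Visible`, and `Cleanup` names are all made up for illustration; this is neither CockroachDB's nor FrostDB's actual code, just the state machine described above with an atomic flip as the linearization point.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// IntentState is the lifecycle of a (hypothetical) transaction intent.
type IntentState int32

const (
	Pending IntentState = iota
	Committed
	Aborted
)

// Intent is owned by the transaction coordinator. Its zero value is Pending.
type Intent struct {
	state atomic.Int32
}

// Commit atomically flips Pending -> Committed. It reports false if the
// intent was already resolved.
func (i *Intent) Commit() bool {
	return i.state.CompareAndSwap(int32(Pending), int32(Committed))
}

// Abort atomically flips Pending -> Aborted.
func (i *Intent) Abort() bool {
	return i.state.CompareAndSwap(int32(Pending), int32(Aborted))
}

func (i *Intent) State() IntentState { return IntentState(i.state.Load()) }

// Part is a unit of written data. A nil intent means the write is resolved
// and implicitly committed.
type Part struct {
	Data   string
	intent *Intent
}

// Visible returns only the parts a read is allowed to see.
func Visible(parts []*Part) []*Part {
	var out []*Part
	for _, p := range parts {
		if p.intent == nil || p.intent.State() == Committed {
			out = append(out, p)
		}
	}
	return out
}

// Cleanup resolves intents: committed parts lose their pointer, aborted
// parts are dropped, and pending parts are kept untouched.
func Cleanup(parts []*Part) []*Part {
	var out []*Part
	for _, p := range parts {
		if p.intent == nil {
			out = append(out, p)
			continue
		}
		switch p.intent.State() {
		case Committed:
			p.intent = nil
			out = append(out, p)
		case Pending:
			out = append(out, p)
		case Aborted:
			// Dropped; reads were already skipping this part.
		}
	}
	return out
}

func main() {
	tx := &Intent{}
	parts := []*Part{{Data: "a", intent: tx}, {Data: "b", intent: tx}}
	fmt.Println(len(Visible(parts))) // 0: the intent is still pending
	tx.Commit()
	fmt.Println(len(Visible(parts))) // 2: both writes appear together
}
```

Because both parts share one intent, a single compare-and-swap makes them visible at the same instant, which is the property a multi-table transaction needs.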
Read isolation is already taken care of by the watermark. The watermark can never advance past an incomplete transaction, so none of that transaction's writes can be included in a read.
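Roughly, the watermark behaves like the sketch below. This is a simplified stand-in rather than the actual FrostDB code: the `Watermark` type and its methods are hypothetical names, and I'm assuming a multi-table transaction uses one tx ID that is marked complete only after every participating table has been written.

```go
package main

import (
	"fmt"
	"sync"
)

// Watermark tracks the highest tx ID below which every transaction has
// completed. Hypothetical simplification for illustration.
type Watermark struct {
	mu   sync.Mutex
	next uint64          // next tx ID to hand out
	high uint64          // every tx ID <= high is complete
	done map[uint64]bool // completed tx IDs that are still above high
}

func NewWatermark() *Watermark {
	return &Watermark{next: 1, done: map[uint64]bool{}}
}

// Begin hands out a new tx ID.
func (w *Watermark) Begin() uint64 {
	w.mu.Lock()
	defer w.mu.Unlock()
	id := w.next
	w.next++
	return id
}

// Complete marks a tx as finished and advances the watermark across any
// contiguous run of completed transactions.
func (w *Watermark) Complete(id uint64) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.done[id] = true
	for w.done[w.high+1] {
		delete(w.done, w.high+1)
		w.high++
	}
}

// High returns the current watermark.
func (w *Watermark) High() uint64 {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.high
}

// Readable reports whether a write tagged with tx is visible to a read
// that starts now.
func (w *Watermark) Readable(tx uint64) bool {
	return tx <= w.High()
}

func main() {
	w := NewWatermark()
	t1 := w.Begin() // multi-table tx, still writing
	t2 := w.Begin()
	w.Complete(t2)
	fmt.Println(w.High()) // 0: t1 is incomplete, so nothing is readable yet
	w.Complete(t1)
	fmt.Println(w.High()) // 2: both transactions become readable
}
```

The point is that a later transaction finishing first does not move the watermark, so a reader can never observe half of an earlier transaction.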
Crashing before commit is something that I still need to handle with the WAL (I believe there's still a TODO in there). But I don't think our WAL implementation handles multiple writes with the same tx, so we'll need prepare/commit messages in the WAL to identify partial transactions during replay, and just drop them if we never got a commit message.
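Something along these lines is what I have in mind for replay. The record layout here (`Prepare`/`Commit` kinds, the `Record` struct, the `Replay` function) is purely hypothetical and not the current WAL format; it only shows the "buffer until commit, drop otherwise" logic.

```go
package main

import "fmt"

// RecordKind distinguishes the two hypothetical WAL message types.
type RecordKind int

const (
	Prepare RecordKind = iota // a write that belongs to a transaction
	Commit                    // marks the transaction as fully written
)

// Record is an assumed WAL entry shape, for illustration only.
type Record struct {
	Kind  RecordKind
	Tx    uint64
	Table string
	Data  string
}

// Replay returns only the writes whose transaction has a Commit record.
// Writes are buffered per tx and released when the commit is seen;
// anything still buffered at the end of the log is dropped.
func Replay(log []Record) []Record {
	pending := map[uint64][]Record{}
	var applied []Record
	for _, r := range log {
		switch r.Kind {
		case Prepare:
			pending[r.Tx] = append(pending[r.Tx], r)
		case Commit:
			applied = append(applied, pending[r.Tx]...)
			delete(pending, r.Tx)
		}
	}
	return applied
}

func main() {
	log := []Record{
		{Kind: Prepare, Tx: 1, Table: "a", Data: "x"},
		{Kind: Prepare, Tx: 2, Table: "a", Data: "y"},
		{Kind: Prepare, Tx: 1, Table: "b", Data: "z"},
		{Kind: Commit, Tx: 1},
		// Crash here: tx 2 never got its commit message.
	}
	fmt.Println(len(Replay(log))) // 2: only tx 1's writes survive
}
```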