swym icon indicating copy to clipboard operation
swym copied to clipboard

Transaction Ordering

Open catherinejarrett opened this issue 4 years ago • 3 comments

I am working to understand how transactions can possibly interact with each other in swym.  Below is an example that I have been working with.  It seems that it is possible for two transactions to commit out of order with respect to the recorded times in the EpochLocks.  Is this the case?  

If so, is it possible for any other transaction to read/write a set of inconsistent values (for example, to read both A=1 and B=100 in the following example)?  I have been unable to find a scenario in which the "out of order" EpochLock values actually cause any issues; is it correct that the time associated with each EpochLock does not need to be able to be compared to the time in a different EpochLock?

Example: Transaction Tx1 reads B and writes (B+1) to A.  Tx2 writes 100 to B.  Initially, B=1.  If we considered a purely sequential model, there are two possible outcomes: (1) Tx1 executes, then Tx2, and the final values are A=2 and B=100; or (2) Tx2 executes, then Tx1, and the final values are A=101 and B=100.

Now suppose we consider a possible execution order with swym. Tx1 reads B=1, then acquires the write lock for A, and validates the read of B=1 successfully.  Next, Tx2 acquires the write lock for B, writes B=100, and increments EpochClock to 1.  Tx2 then releases the lock on B, publishing 100 to B and setting the time in the EpochLock associated with B to 1.  Next, Tx1 writes A=2, increments EpochClock, and releases the lock, setting the time in the EpochLock associated with A to 2. In this example, the final values are A=2 and B=100, which corresponds to scenario (1) above.  However, the EpochLock times associated with A and B indicate that Tx1 and Tx2 completed in the opposite order.

To summarize, my two questions are: (1) Am I understanding the code correctly that the above situation can indeed occur? (2) Is there any scenario in which this situation (the two transactions seeming to commit in the opposite order) matters for any other transactions?

catherinejarrett avatar Jun 04 '20 22:06 catherinejarrett

This is a really great question!

It seems that it is possible for two transactions to commit out of order with respect to the recorded times in the EpochLocks. Is this the case?

Yep.

I am pretty sure you get this, but just in case, the goal of swym is to have "Single Lock Atomicity" (SLA) style guarantees. Basically the outcome of those two transactions should only be the same as the purely sequential model - A=2, B=100, or A=101, B=100. It appears you agree that these are the only two outcomes when using swym.

What I hear you saying is that the version numbering on TCell's might not reflect the actual order in which values were published.

  1. Yes. You totally get it :).
  2. No. This out of order publishing cannot affect any other transactions. A is unreadable (due to being locked) by any third transaction for the entire duration of Tx2's commit. So this "in between" state is not observable. It is strange that the versions on the TCell's are effectively backwards from the order in which the writes "appear" to have happened using the SLA model. swym doesn't really utilize version information at a more fine grained level than simply asking the question, "was this value set sometime (doesn't matter when) before the transaction started".

In case you find it interesting (I sure do), a transaction with no read log, like Tx2 is unusual, and effectively has no constraints on it. In theory swym could guarantee that these transactions never fail, however, the commit operation may need a couple retries (due to lock acquiring failing), but the user code would only need to be run once.

Thanks for the great question!

mtak- avatar Jun 11 '20 19:06 mtak-

If that doesn't clear things up, feel free to ask follow up questions.

mtak- avatar Jun 11 '20 19:06 mtak-

Maybe looking at some cases for a Tx3 would also help.

Tx3 starts when EpochClock=0: It can only read values A=start value, B=1

Tx3 starts when EpochClock=2: It can only read values A=2, B=100

nontrivial case Tx3 starts when EpochClock=1: It can only read B=100, and A is totally unreadable due to the lock being held on A, or A having a version number of 2. Therefore this transaction is doomed to retry.

mtak- avatar Jun 11 '20 20:06 mtak-