
Partially address inconsistency problems during concurrent update

Open SmartKeyerror opened this issue 1 year ago • 0 comments
There are 2 commits; please review them separately.


The first commit is quite simple: it attempts to fix #17176 by changing the 32-bit integer to a 64-bit one.
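As a generic illustration of why widening an integer can matter, the sketch below emulates fixed-width counters with bit masks. It does not reproduce the code touched by #17176: which field is involved and whether it is signed are not stated here, so `increment` and the counter names are hypothetical.

```python
# Emulate fixed-width unsigned integers with bit masks (illustrative only).
MASK_32 = 0xFFFFFFFF
MASK_64 = 0xFFFFFFFFFFFFFFFF


def increment(counter, mask):
    """Add one and keep only the bits that fit in the given width."""
    return (counter + 1) & mask


# Start both counters at the largest value a 32-bit unsigned integer can hold.
counter_32 = increment(MASK_32, MASK_32)  # high bit is lost, value wraps
counter_64 = increment(MASK_32, MASK_64)  # value still fits

print(counter_32)  # prints 0
print(counter_64)  # prints 4294967296
```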


The second commit attempts to resolve what #9452 left open: that change did not fully fix inconsistent snapshots during concurrent updates, as #17134 shows.

The reasoning behind the fix is simple. Greenplum has no global tuple lock mechanism, so tuple locks are handled by the local transaction on each segment. That breaks the global property of 2PL, as shown in the diagram below:

image

In the illustration, the black arrow represents T1, and the blue arrow represents T2. Arrows pointing from QD to QE indicate commands sent from QD and executed by QE, while arrows pointing from QE to QD indicate QE executing the commands and returning the results to QD. Time flows from left to right.

Both T1 and T2 attempt to update the same Tuple.

As shown, T1 executes before T2. After T1 completes the prepare phase of the two-phase commit, T2 begins its update. Since T1 has not yet committed, T2 waits for T1 to finish (Wait Tuple Lock). When QD sends the commit command of the two-phase commit to QE, T1 finishes locally and the Tuple lock is released, even though the result has not yet reached QD. T2 then acquires the Tuple Lock and updates the Tuple.

However, the return of results from T1 to QD is relatively slow, while T2 executes faster than T1. When T2 returns to QD and attempts to obtain a distributed snapshot, the distributed transaction state of T1 remains uncommitted. Consequently, incorrect snapshot information is obtained, leading to erroneous visibility determination when executing the SELECT statement on QE, resulting in incorrect data being returned.
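The timeline above can be replayed in a small single-threaded toy model. This is only a sketch of the ordering problem, not Greenplum code: `Coordinator`, `Segment`, and every method name are hypothetical stand-ins for the QD, a QE, and their bookkeeping.

```python
# Toy replay of the race; all names are illustrative, not Greenplum APIs.

class Coordinator:
    """Stands in for the QD: tracks distributed transactions in progress."""

    def __init__(self):
        self.in_progress = set()

    def begin(self, dxid):
        self.in_progress.add(dxid)

    def mark_committed(self, dxid):
        self.in_progress.discard(dxid)

    def distributed_snapshot(self):
        # A snapshot records which distributed transactions are still running.
        return frozenset(self.in_progress)


class Segment:
    """Stands in for one QE: owns the local tuple lock."""

    def __init__(self):
        self.tuple_lock_holder = None

    def lock_tuple(self, dxid):
        if self.tuple_lock_holder is not None:
            return False  # the caller would have to wait
        self.tuple_lock_holder = dxid
        return True

    def local_commit(self, dxid):
        # The local commit releases the tuple lock immediately.
        if self.tuple_lock_holder == dxid:
            self.tuple_lock_holder = None


qd, qe = Coordinator(), Segment()
qd.begin("T1")
qd.begin("T2")

t1_locked = qe.lock_tuple("T1")         # T1 updates the tuple
t2_blocked = not qe.lock_tuple("T2")    # T2 has to wait for the tuple lock
qe.local_commit("T1")                   # commit runs on the QE, lock released
t2_locked = qe.lock_tuple("T2")         # T2 proceeds and updates the tuple

# T2 reaches the QD before T1's commit result has been processed there.
snapshot_t2 = qd.distributed_snapshot()
qd.mark_committed("T1")                 # T1's result arrives too late

print("T1" in snapshot_t2)  # prints True: T1 still looks uncommitted to T2
```

In this model the snapshot taken by T2 still lists T1 as in progress, although T1 has already committed on the segment and T2 has already built on its update.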

Under strict distributed 2PL, transaction T2 must wait until T1, a distributed transaction, has committed globally before it can execute further statements in its own transaction. This PR enforces exactly that.

image
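The stricter ordering can be sketched with a thread that blocks until the coordinator has recorded the global commit. This is a minimal model under stated assumptions, not the mechanism used in the PR: `Coordinator`, `wait_for_global_commit`, and the use of `threading.Event` are all hypothetical choices made for illustration.

```python
import threading


class Coordinator:
    """Stands in for the QD; names are illustrative, not Greenplum APIs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_progress = set()
        self._committed = {}  # dxid -> Event set once the commit is global

    def begin(self, dxid):
        with self._lock:
            self._in_progress.add(dxid)
            self._committed[dxid] = threading.Event()

    def mark_committed(self, dxid):
        with self._lock:
            self._in_progress.discard(dxid)
            event = self._committed[dxid]
        event.set()  # wake up anyone waiting for the global commit

    def wait_for_global_commit(self, dxid, timeout=None):
        return self._committed[dxid].wait(timeout)

    def distributed_snapshot(self):
        with self._lock:
            return frozenset(self._in_progress)


qd = Coordinator()
qd.begin("T1")
qd.begin("T2")
result = {}


def t2_after_tuple_lock():
    # T2 already holds the tuple lock that T1 released at its local commit.
    # Before doing anything else, it waits for T1 to be committed globally.
    result["waited"] = qd.wait_for_global_commit("T1", timeout=5)
    result["snapshot"] = qd.distributed_snapshot()


worker = threading.Thread(target=t2_after_tuple_lock)
worker.start()
qd.mark_committed("T1")  # the QD processes T1's commit result
worker.join()

print("T1" in result["snapshot"])  # prints False: T1 is seen as committed
```

Because T2 takes its snapshot only after the wait returns, the snapshot can no longer list T1 as in progress.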

However, this PR doesn't resolve every problem in the case of #17134. More work is needed, but we can divide the large, complicated problem into small ones and fix them one by one.

SmartKeyerror, Apr 17 '24 07:04