tidb
tidb copied to clipboard
doc: lazy constraint check in pessimistic txn
What problem does this PR solve?
Issue Number: ref #36579
Check List
Tests
- [ ] Unit test
- [ ] Integration test
- [ ] Manual test (add detailed scripts or steps below)
- [x] No code
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.
None
[REVIEW NOTIFICATION]
This pull request has been approved by:
- cfzjywxk
- ekexium
To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.
The full list of commands accepted by this bot can be found here.
Reviewer can indicate their review by submitting an approval review. Reviewer can cancel approval by submitting a request changes review.
cc @ekexium @TonsnakeLin @cfzjywxk
Code Coverage Details: https://codecov.io/github/pingcap/tidb/commit/a197163ea7165467fe83c34dd1a0f31139b5db12
Shall we add a section to demonstrate the safety of the change since the constraint check mechanism is subtle and vulnerable?
~~I am somewhat convinced of the safety of write: writes won't break the integrity of data in the storage. While the safety of read, namely read statements won't return results that break the constraint, may require further consideration.~~
A case similar to that in #24195, if we enable the feature:
create table t2 (id int primary key, uk int, unique key i1(uk))
insert into t2 values (1, 1)
set @@tidb_txn_assertion_level=off
begin pessimistic
insert into t2 values (2, 1)
select * from t2 use index(primary) for update -- same result whether for update or not
delete from t2 where id = 1
commit
admin check table t2
The select result is 2 rows (1, 1) and (2, 1), which breaks the uniqueness constraint. While the final commit can succeed. And the admin check will report data-index inconsistency. BTW, assertion could prevent the txn from committing.
"Acquisition of lock always resolves constraint check" doesn't help. The feature sort of enlarges the anomaly
For reasoning its safety, I'm considering what if we allow an arbitrary mix of optimistic and pessimistic mutations in one transaction? Will it break any safety property?
For reasoning its safety, I'm considering what if we allow an arbitrary mix of optimistic and pessimistic mutations in one transaction? Will it break any safety property?
Pessimistic mutations are inheritantly no different from optimistic mutations after they are turned into 2pc locks in prewrite, so I think it is very unlikely that this will break any properties.
Instead, I worry more about the handling about union scan and generating the mutations. It's more complicated when the result set sometimes break constraints but the transaction can commit in the end.
the handling about union scan and generating the mutations
There're several cases we may need to design and handle carefully, for example:
delete-your-writefrom which we've encountered data inconsistency issues before. The inserted content may not need to be committed but constraint checks are still required.- Report
duplicateerror in time and avoid further weird behaviors considering theunionScanexecutor. - The transactional key flags and statement level key flags processing which are easy to make mistakes.
When a key that has its constraint check deferred in a previous statement gets modified, we have 2 options:
-
Lock it as we did before, resolve the constraint check and unset the defer flag Then we need some kind of a "once flag" mechanism, considering the possibility that a statement may involve multiple operations on a key. Any modifications to the key (even in the same statement) will unset the flag.
-
Skip the lock. The reason is the same as the one that let us skip the constraint check: if the user believes the key is less likely to have conflicts, why don't we skip the lock as well? It doesn't affect the success rate after all. It requires
PresumeNotExistflags on more than PUT mutations. An extra Op like DeleteMustExist needs to be added, at least.
2. Skip the lock. The reason is the same as the one that let us skip the constraint check: if the user believes the key is less likely to have conflicts, why don't we skip the lock as well? It doesn't affect the success rate after all. It requires
PresumeNotExistflags on more than PUT mutations. An extra Op like DeleteMustExist needs to be added, at least.
What I'm not sure about the second option is the risk of how we handle a result set with broken constraints. And you have found an example above that there can be inconsistency unless we add new special ops. There could be other possible issues that we haven't found.
There could be other possible issues that we haven't found.
Yes, I'm pursuing proof of the safety of this change, whichever options we take.
select * from t2 use index(primary) for update -- same result whether for upda
@ekexium
Does it happen because the select for update statement does not acquire the pessimistic lock on the unique key but just the row key? Actually, the unique key should also be locked though it's not changed as described in the issue.
It seems that if the row key and unique key are always locked Acquisition of lock always resolves constraint check should be able to work.
select * from t2 use index(primary) for update -- same result whether for upda@ekexium Does it happen because the
select for updatestatement does not acquire the pessimistic lock on the unique key but just the row key? Actually, the unique key should also be locked though it's not changed as described in the issue. It seems that if the row key and unique key are always lockedAcquisition of lock always resolves constraint checkshould be able to work.
@cfzjywxk The problem exists in the implementation of option 2 mentioned above (so neither row key nor unique key is locked in the select for update and delete). Option 1 is free from this issue.
Actually, the unique key should also be locked though it's not changed as described in the https://github.com/pingcap/tidb/issues/36438.
For option 1, would this still be a problem if the lock on a unique key is deferred and then the select for update does not lock it as expected because of the issue 36438 so the constraint resolving does not work on this unique key?
Actually, the unique key should also be locked though it's not changed as described in the #36438.
For option 1, would this still be a problem if the lock on a unique key is deferred and then the
select for updatedoes not lock it as expected because of the issue 36438 so theconstraint resolvingdoes not work on this unique key?
I think so. I mean it's free from data-index inconsistency. The seemingly invalid select result is an old issue. It's better to fix it, while our bottom line is that we guarantee the constraint satisfaction of read results only if the transaction successfully commits.
/merge
This pull request has been accepted and is ready to merge.
TiDB MergeCI notify
| CI Name | Result | Duration | Compare with Parent commit |
|---|---|---|---|
| idc-jenkins-ci/integration-cdc-test | 🟢 all 36 tests passed | 54 min | Existing passed |
| idc-jenkins-ci-tidb/tics-test | 🟢 all 1 tests passed | 18 min | Existing passed |
| idc-jenkins-ci-tidb/integration-common-test | 🟢 all 17 tests passed | 12 min | Existing passed |
| idc-jenkins-ci-tidb/common-test | 🟢 all 11 tests passed | 12 min | Existing passed |
| idc-jenkins-ci-tidb/integration-ddl-test | 🟢 all 6 tests passed | 5 min 51 sec | Existing passed |
| idc-jenkins-ci-tidb/sqllogic-test-2 | 🟢 all 28 tests passed | 5 min 31 sec | Existing passed |
| idc-jenkins-ci-tidb/sqllogic-test-1 | 🟢 all 26 tests passed | 4 min 18 sec | Existing passed |
| idc-jenkins-ci-tidb/mybatis-test | 🟢 all 1 tests passed | 3 min 43 sec | Existing passed |
| idc-jenkins-ci-tidb/integration-compatibility-test | 🟢 all 1 tests passed | 3 min 12 sec | Existing passed |
| idc-jenkins-ci-tidb/plugin-test | 🟢 build success, plugin test success | 4min | Existing passed |