*(ticdc): split old update kv entry after restarting changefeed (#10919)
This is an automated cherry-pick of #10919
What problem does this PR solve?
Issue Number: close #10918
What is changed and how it works?
- Split the kv entry of every update event whose commit ts is older than the ts at which the changefeed started;
- Do not split update events inside the sink module when the downstream is MySQL;
- Restart the changefeed when a duplicate entry error is encountered;
- When applying the redo log, split update events that modify the handle key into delete events and insert events, and cache the insert events until all delete events in the same transaction have been emitted. If there are too many insert events (more than 50), they are written to a temporary local file;
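
For reviewers, the splitting and ordering rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RowEvent`, `shouldSplitOnRestart`, `splitUpdate` and `reorderTxn` are hypothetical names, the handle key is reduced to a single string, and spilling cached inserts to a temp file once there are more than 50 is not modeled.

```go
package main

// RowEvent is a simplified, hypothetical stand-in for a row change event.
// PreKey is the handle key before the change ("" for an insert);
// Key is the handle key after the change ("" for a delete).
type RowEvent struct {
	CommitTs uint64
	PreKey   string
	Key      string
}

// isUpdate reports whether the event carries both an old and a new row.
func (e RowEvent) isUpdate() bool { return e.PreKey != "" && e.Key != "" }

// shouldSplitOnRestart mirrors the first rule: an update whose commit ts is
// older than the ts at which the changefeed started is split.
// startTs is an assumed parameter name.
func shouldSplitOnRestart(e RowEvent, startTs uint64) bool {
	return e.isUpdate() && e.CommitTs < startTs
}

// splitUpdate turns one update into a delete of the old row and an insert
// of the new row, both keeping the original commit ts.
func splitUpdate(e RowEvent) (del, ins RowEvent) {
	del = RowEvent{CommitTs: e.CommitTs, PreKey: e.PreKey}
	ins = RowEvent{CommitTs: e.CommitTs, Key: e.Key}
	return del, ins
}

// reorderTxn mirrors the redo-apply rule for a single transaction: updates
// that change the handle key are split, the delete half is emitted in place,
// and the insert half is cached until every event of the transaction has
// been seen. Other events pass through unchanged (a simplification).
func reorderTxn(txn []RowEvent) []RowEvent {
	out := make([]RowEvent, 0, len(txn))
	var pendingInserts []RowEvent
	for _, e := range txn {
		if e.isUpdate() && e.PreKey != e.Key {
			del, ins := splitUpdate(e)
			out = append(out, del)
			pendingInserts = append(pendingInserts, ins)
			continue
		}
		out = append(out, e)
	}
	// All deletes are out; the cached inserts can now be emitted safely.
	return append(out, pendingInserts...)
}
```

With a transaction that updates key `1` to `2` and key `2` to `3`, `reorderTxn` yields delete `1`, delete `2`, insert `2`, insert `3`, so the insert of `2` can no longer collide with the old row `2` downstream.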
Check List
Tests
- Integration test
- Unit test
Questions
Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?
Release note
Fix potential risk of data inconsistency when there are dependencies between update statements in the same transaction.
This cherry pick PR is for a release branch and has not yet been approved by triage owners.
Adding the do-not-merge/cherry-pick-not-approved label.
To merge this cherry pick:
- It must first be approved by the approvers.
- AFTER it has been approved by approvers, please wait for the cherry-pick merging approval from triage owners.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign amyangfei for approval. For more information see the Code Review Process.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment
/test verify
/retest
/test cdc-integration-mysql-test
/test cdc-integration-pulsar-test
/retest
@ti-chi-bot: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:
| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| pull-cdc-integration-mysql-test | 6eaea1a74181e4980492703bfa3ec03eeeb1d3ad | link | true | /test cdc-integration-mysql-test |
/test dm-integration-test
/test cdc-integration-mysql-test
/test dm-integration-test
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: asddongmen
The pull request process is described here
- ~~OWNERS~~ [asddongmen]
/retest
/test verify
/test cdc-integration-pulsar-test
/test cdc-integration-mysql-test
/test cdc-integration-storage-test
/test verify