cdc sink reports "Write Conflict" error and causes checkpointTs lags back
What did you do?
From the on-call: there have always been 3 CDC nodes; none was added or removed. The changefeed never actually catches up because one of the CDC nodes is always stuck. Pausing and resuming the changefeed changes which CDC node is stuck, which produces this flapping behaviour.
We also found unexpected 'Write conflict' errors in the log of one CDC node.
log:
[2022/09/23 07:28:36.822 +00:00] [WARN] [mysql.go:612] ["execute DMLs with error, retry later"] [error="
[CDC:ErrMySQLTxnError]MySQL txn error: Error 9007: Write conflict, txnStartTS=436186150766379010, conflictStartTS=436186150727057454, conflictCommitTS=436186150766379021,
key={tableID=172, indexID=4, indexValues={1, 85709, 15, 13, }} primary={tableID=161, indexID=1, indexValues={346943, 1850395608943165440, 729430118, 1, 1598468504, }} [try again later]"]
What did you expect to see?
No response
What did you see instead?
--
Versions of the cluster
Upstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client): 6.1.0
Upstream TiKV version (execute tikv-server --version): 6.1.0
TiCDC version (execute cdc version): v6.1.0
key={tableID=172, indexID=4, indexValues={1, 85709, 15, 13, }} primary={tableID=161, indexID=1, indexValues={346943, 1850395608943165440, 729430118, 1, 1598468504, }}
Since key.tableID differs from primary.tableID, the conflicting transaction spans multiple tables, so it is expected to come from write operations issued by the user in the downstream.
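If it helps triage, the numeric IDs in that key can usually be translated into names on the downstream TiDB; a minimal sketch, assuming the downstream is TiDB and the tables still exist with the same IDs:

-- map the tableIDs from the error to schema/table names
SELECT table_schema, table_name, tidb_table_id
FROM information_schema.tables
WHERE tidb_table_id IN (172, 161);

-- map indexID=4 of tableID=172 to its index name
SELECT i.table_schema, i.table_name, i.key_name, i.index_id
FROM information_schema.tidb_indexes i
JOIN information_schema.tables t
  ON t.table_schema = i.table_schema AND t.table_name = i.table_name
WHERE t.tidb_table_id = 172 AND i.index_id = 4;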
Are you saying that there are write operations from applications going on in the target cluster? Or are these multiple operations from TiCDC that conflict with each other? Would it be possible to print/find the connection IDs of the conflicting transactions?
Would it be possible to:
- show table names (and index/object?)
- show a human-readable logical/physical part of the timestamps? (see the sketch below)
- have a more detailed message if key.tableID and primary.tableID are not the same?
- have information about the involved threads?
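Regarding the timestamps: until TiCDC prints them in a friendlier form, the TSOs from the log can already be decoded by hand; a minimal sketch, assuming a TiDB version with TIDB_PARSE_TSO (the physical part is milliseconds since the Unix epoch in the high bits, the logical part is the low 18 bits):

-- human-readable physical time of the TiCDC transaction's start TSO
SELECT TIDB_PARSE_TSO(436186150766379010);

-- raw physical (ms since epoch) and logical parts of the same TSO
SELECT (436186150766379010 >> 18) AS physical_ms,
       (436186150766379010 & 262143) AS logical_part;   -- 262143 = 2^18 - 1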
Are you saying that there are write operations from applications going on in the target cluster?

Yes. The transactions created by TiCDC change rows in only one table each, so the conflicting multi-table transaction has to come from writes issued by the downstream application.
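For anyone trying to reproduce the error shape: a write conflict like the one in the log can be triggered on a plain TiDB instance by two overlapping optimistic transactions; a rough illustration with a hypothetical table t(id, v) (not taken from this cluster, and not implying the sink necessarily runs in optimistic mode):

-- session A: stands in for the TiCDC sink transaction
BEGIN OPTIMISTIC;
UPDATE t SET v = v + 1 WHERE id = 1;

-- session B: stands in for the downstream application, commits first
BEGIN OPTIMISTIC;
UPDATE t SET v = v + 10 WHERE id = 1;
COMMIT;

-- back in session A: the commit fails with Error 9007 (Write conflict),
-- which is the error TiCDC logs and retries on
COMMIT;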
/close
@nongfushanquan: Closing this issue.