
Tracking issue for splitting big transaction

Opened by CharlesCheung96 · 1 comment

Is your feature request related to a problem?

For CDC, a large transaction is one that contains a large number of row-level KV changes. In the current data flow (puller --> sorter --> mounter --> sink), the sorter acts as an effectively unbounded reservoir, and its output for a single transaction can be formalized as the event sequence pre_resolved_ts, rawkv_1, rawkv_2, rawkv_3, ..., rawkv_n, resolved_ts. The following problems may occur when n is large.

  1. OOM problem: the sink flushes data downstream only when it receives resolved_ts, so many change events pile up in memory; meanwhile, since the mounter decodes rawkv events into RowChangedEvent, memory consumption grows even further.
  2. Sink latency problem: in the current implementation, MysqlSink reassembles the above change events into one transaction based on the StartTs of each RowChangedEvent; to preserve the ACID properties of single-table transactions, MysqlSink uses only one worker to write a single transaction downstream, which limits the throughput of the sink module in large-transaction scenarios.
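The OOM problem above can be sketched in a few lines. This is a hypothetical illustration, not tiflow's actual code: a sink that buffers every row event and only flushes at a resolved-ts event necessarily holds the whole transaction in memory, so the buffer grows linearly with n.

```go
package main

import "fmt"

// event is an illustrative stand-in for the sorter's output:
// either a row-level KV change or a resolved-ts marker.
type event struct {
	isResolved bool
	row        string
}

// collectUntilResolved mimics the current flush discipline: rows
// accumulate in memory and are only released at the resolved-ts
// event. For a huge transaction this buffer is the OOM risk.
func collectUntilResolved(events []event) []string {
	var buf []string
	for _, e := range events {
		if e.isResolved {
			return buf // the only flush point
		}
		buf = append(buf, e.row)
	}
	return buf
}

func main() {
	stream := []event{
		{row: "rawkv_1"}, {row: "rawkv_2"}, {row: "rawkv_3"},
		{isResolved: true},
	}
	fmt.Println(len(collectUntilResolved(stream)))
}
```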

Describe the feature you'd like

Solve the OOM and latency problems by splitting large transactions and writing them downstream in multiple batches.
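The proposed fix can be sketched as follows. This is an illustrative outline under assumed names (splitIntoBatches and the batch size are not tiflow identifiers): instead of materializing a whole transaction before writing, the rows are emitted in fixed-size batches, which bounds memory and lets the sink start flushing before the transaction ends.

```go
package main

import "fmt"

// splitIntoBatches cuts a transaction's rows into chunks of at most
// batchSize rows each, so the sink can flush incrementally instead
// of holding the entire transaction in memory.
func splitIntoBatches(rows []string, batchSize int) [][]string {
	var batches [][]string
	for len(rows) > 0 {
		n := batchSize
		if n > len(rows) {
			n = len(rows)
		}
		batches = append(batches, rows[:n])
		rows = rows[n:]
	}
	return batches
}

func main() {
	rows := []string{"r1", "r2", "r3", "r4", "r5"}
	for i, b := range splitIntoBatches(rows, 2) {
		fmt.Printf("batch %d: %v\n", i, b)
	}
}
```

Note that once a transaction is split this way, downstream atomicity of the original transaction is no longer guaranteed, which is why the checklist below includes redo-log compatibility work and a split-txn switch.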

Related Issues

  • [x] https://github.com/pingcap/tiflow/pull/5203
  • [x] #1683
  • [x] #5280
  • [x] #5398
  • [x] #5453
  • [x] Make mq sink support split transactions
  • [ ] Make redo log compatible with the transaction splitting mechanism
  • [x] Add integration tests to cover big transaction scenario
  • [x] add split-txn switcher

CharlesCheung96 commented Apr 21, 2022

/label affects-6.1

nongfushanquan commented Jul 6, 2022