pipeline migration: why use sql update(only include changed column) in increment task
Feature Request
In the incremental processing stage, the system extracts the latest changes from row-based binlog events, identifies the modified columns, and generates the corresponding UPDATE statement.
A more elegant approach would be to convert it into an INSERT ... ON DUPLICATE KEY UPDATE statement.(override all column)
or at lease update all columns。
Here is a problem?
Data Synchronization Issue with Timestamp Columns
The source table contains a column gmt_modified (datetime type) that automatically updates:
gmt_modified datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
Current Behavior: When the source table updates a row where the new gmt_modified value remains identical to the previous value (within second precision), the pipeline's sink operation:
Generates an UPDATE SQL statement that excludes the gmt_modified column
So gmt_modified will be updated by current time not from source table.
Result: The target table's gmt_modified value does not match the source table's value.