
DM precheck failed to check GTID when stop and then start from checkpoint with same task config

Open D3Hunter opened this issue 4 months ago • 2 comments

What did you do?

Below are the steps I summarized from user feedback. I haven't reproduced it locally, but from the code it looks like it will fail.

  • start a DM incremental task with an explicit GTID
  • run for a while, then purge some binlogs, and run for a while longer
  • stop the task; the checkpoint is kept
  • start again with the same config; this time we actually start from the checkpoint
  • precheck "meta_position" fails with "ERROR 1236 (HY000): The slave is connecting using CHANGE MASTER TO MASTER_AUTO_POSITION = 1, but the master has purged binary logs containing GTIDs that the slave requires."

In the precheck we always use the position from the task config. But since we are recovering from a checkpoint, we should use the position in the checkpoint instead, because the checkpoint position is the one replication will actually start from. We might also need to consider this issue: https://github.com/pingcap/dm/issues/1418

Relevant code: https://github.com/pingcap/tiflow/blob/9062d7c0cac6839d157379c38785f55fe2c3275a/dm/pkg/checker/binlog.go#L356-L368
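To make the proposed behavior concrete, here is a minimal sketch, not the actual DM checker code. `GTIDSet`, `Location`, `pickPrecheckLocation`, and `checkMetaPosition` are hypothetical names invented for illustration, and the GTID set is simplified to "highest transaction number per server UUID" (that is, each set is assumed to be one contiguous interval 1-N). It only shows why validating the config position fails after a purge while validating the checkpoint position passes.

```go
package main

import "fmt"

// GTIDSet is a deliberately simplified model (an assumption for this sketch):
// for each server UUID it stores the highest transaction number N,
// standing for the contiguous interval 1-N.
type GTIDSet map[string]int64

// Contains reports whether s includes every transaction in other.
func (s GTIDSet) Contains(other GTIDSet) bool {
	for uuid, n := range other {
		if s[uuid] < n {
			return false
		}
	}
	return true
}

// Location is a hypothetical stand-in for a replication start position.
type Location struct {
	GTID GTIDSet
}

// pickPrecheckLocation returns the position the precheck should validate:
// the checkpoint if one exists, otherwise the position from the task config.
func pickPrecheckLocation(configLoc Location, checkpoint *Location) Location {
	if checkpoint != nil {
		return *checkpoint
	}
	return configLoc
}

// checkMetaPosition mimics the condition behind ERROR 1236: the upstream can
// serve the stream only if every purged transaction is already contained in
// the GTID set the downstream starts from.
func checkMetaPosition(start Location, purged GTIDSet) error {
	if !start.GTID.Contains(purged) {
		return fmt.Errorf("upstream has purged binlogs containing GTIDs required from %v", start.GTID)
	}
	return nil
}

func main() {
	const uuid = "3e11fa47-71ca-11e1-9e33-c80aa9429562" // made-up server UUID

	configLoc := Location{GTID: GTIDSet{uuid: 100}}     // explicit GTID in the task config
	checkpoint := &Location{GTID: GTIDSet{uuid: 5000}}  // where the task actually stopped
	purged := GTIDSet{uuid: 3000}                       // binlogs purged while the task ran

	// Checking the stale config position fails.
	fmt.Println("config position:", checkMetaPosition(configLoc, purged))
	// Checking the position that will really be used succeeds.
	fmt.Println("checkpoint position:", checkMetaPosition(pickPrecheckLocation(configLoc, checkpoint), purged))
}
```

With no checkpoint (`nil`), `pickPrecheckLocation` falls back to the config position, so a fresh task would still be checked against what the user configured.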

What did you expect to see?

No response

What did you see instead?

No response

Versions of the cluster

DM version 7.5.3; upstream version is unknown right now.

current status of DM cluster (execute query-status <task-name> in dmctl)

(paste current status of DM cluster here)

D3Hunter • Oct 11 '24 08:10