
DM dump phase has a significant performance impact on upstream MySQL

Open glorv opened this issue 3 years ago • 3 comments

What did you do?

Migrated 20 sharded tables from MySQL, each about 20 GB. The dump config is:

mydumpers:      
  global:
    threads: 1
    chunk-filesize: 64
    extra-args: "--consistency none -r 10000"

I expected that the Dumpling phase would not impact the source MySQL at all, but the write latency increased consistently after the DM task started.

What did you expect to see?

No response

What did you see instead?

No response

Versions of the cluster

DM version (run dmctl -V or dm-worker -V or dm-master -V):

(paste DM version here, and you must ensure versions of dmctl, DM-worker and DM-master are same)

Upstream MySQL/MariaDB server version:

(paste upstream MySQL/MariaDB server version here)

Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

(paste TiDB cluster version here)

How did you deploy DM: tiup or manually?

(leave TiUP or manually here)

Other interesting information (system version, hardware config, etc):


v2.0.1

current status of DM cluster (execute query-status <task-name> in dmctl)

(paste current status of DM cluster here)

glorv avatar Jan 20 '22 10:01 glorv

@lichunzhu found a good blog post that explains the reason: https://plaid.com/blog/exploring-performance-differences-between-amazon-aurora-and-vanilla-mysql/

So we plan to use multiple shorter transactions to SELECT and dump the data, and after the dump finishes, record the exit position for DM safe mode.
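A minimal sketch of the "multiple shorter transactions" idea, not Dumpling's actual implementation. It uses Python's built-in `sqlite3` as a stand-in for MySQL; the table `t`, the integer key `id`, and the helper `dump_in_chunks` are hypothetical names chosen for illustration. Each chunk is read in its own short transaction using keyset pagination on the primary key, so no single long-lived read view is held open on the source.

```python
import sqlite3


def dump_in_chunks(conn, table, key, chunk_rows):
    """Yield lists of rows, reading each chunk in its own short transaction.

    Assumes `key` is a unique, ordered column and is the first column of `table`.
    """
    last_key = None
    while True:
        cur = conn.cursor()
        cur.execute("BEGIN")  # one short transaction per chunk
        if last_key is None:
            cur.execute(
                f"SELECT * FROM {table} ORDER BY {key} LIMIT ?",
                (chunk_rows,),
            )
        else:
            # Keyset pagination: resume strictly after the last key seen.
            cur.execute(
                f"SELECT * FROM {table} WHERE {key} > ? ORDER BY {key} LIMIT ?",
                (last_key, chunk_rows),
            )
        rows = cur.fetchall()
        cur.execute("COMMIT")  # release the read view before the next chunk
        if not rows:
            return
        last_key = rows[-1][0]
        yield rows


# Demo on an in-memory database (stand-in for the upstream MySQL).
# isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [(i, f"row{i}") for i in range(1, 26)]
)
chunks = list(dump_in_chunks(conn, "t", "id", 10))
```

Because the chunks are read at different points in time, the dump as a whole is not one consistent snapshot. That appears to be why the plan above also records the exit position for DM safe mode.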

It would also be better if Dumpling could detect column changes during the dump and restart the whole dump process.
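One way to picture the column-change check is to snapshot the table's columns before and after the dump and restart when they differ. The sketch below is an assumption about how this could look, not an existing Dumpling feature. It uses `sqlite3` and `PRAGMA table_info` as a stand-in for querying column metadata on MySQL, and `column_snapshot`, `dump_with_schema_check`, and `flaky_dump` are hypothetical helpers.

```python
import sqlite3


def column_snapshot(conn, table):
    """Return [(column_name, declared_type), ...] for `table`."""
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({table})")]


def dump_with_schema_check(conn, table, dump_fn, max_restarts=3):
    """Run dump_fn; restart the whole dump if the columns changed meanwhile.

    Returns (rows, number_of_restarts).
    """
    for restarts in range(max_restarts + 1):
        before = column_snapshot(conn, table)
        rows = dump_fn(conn, table)
        after = column_snapshot(conn, table)
        if before == after:
            return rows, restarts
        # Columns changed during the dump: discard the rows and try again.
    raise RuntimeError(f"schema of {table} kept changing during dump")


# Demo: the first dump attempt is interrupted by a simulated DDL change.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b")])

calls = {"n": 0}


def flaky_dump(conn, table):
    rows = conn.execute(f"SELECT * FROM {table}").fetchall()
    calls["n"] += 1
    if calls["n"] == 1:
        # Simulate a concurrent ALTER TABLE on the upstream.
        conn.execute(f"ALTER TABLE {table} ADD COLUMN extra TEXT")
    return rows


rows, restarts = dump_with_schema_check(conn, "t", flaky_dump)
```

Here the first attempt is thrown away because a column was added mid-dump, and the second attempt returns rows with the new column included.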

lance6716 avatar Jan 27 '22 02:01 lance6716

https://github.com/pingcap/docs-cn/pull/9136

We are adding a document that guides the user to create a new Aurora instance from a snapshot (not a replica Aurora) as a workaround, so we are lowering the severity.

lance6716 avatar Jun 16 '22 06:06 lance6716

When we provide an end-to-end cloud data migration service, this problem should be assigned a higher priority. @niubell

lance6716 avatar Sep 30 '22 02:09 lance6716