seatunnel icon indicating copy to clipboard operation
seatunnel copied to clipboard

[Bug] [connector-jdbc] mysql to mysql slow

Open samgang opened this issue 1 year ago • 6 comments

Search before asking

  • [X] I had searched in the issues and found no similar issues.

What happened

The speed of data synchronization from MySQL to MySQL is very slow. So how should we improve the synchronization speed?

Also, is there a preSql function? such as deleting target table data before synchronization

SeaTunnel Version

apache-seatunnel-incubating-2.3.1

SeaTunnel Config

env {
  execution.parallelism = 1
  job.mode = "BATCH"
}

source {
    Jdbc {
        url = "jdbc:mysql://localhost:3306/test"
        driver = "com.mysql.cj.jdbc.Driver"
        connection_check_timeout_sec = 100
        user = "xxxxxx"
        password = "xxxxxx"
        query = "select id,name from t1"
        partition_column = "id"
        partition_num = 10
    }
}

sink {
  jdbc {
    url = "jdbc:mysql://localhost:3306/test"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "xxxxxx"
    password = "xxxxxx"
    query = "INSERT INTO t2(id,name) VALUES (?,?)"
  }
}

Running Command

./bin/seatunnel.sh --config test2.conf  -m local

Error Exception

Read Count So Far         :              282048
Write Count So Far        :              279999
Average Read Count        :               233/s
Average Write Count       :               233/s
Last Statistic Time       : 2023-04-24 17:33:30
Current Statistic Time    : 2023-04-24 17:34:30

Read Count So Far         :              295048
Write Count So Far        :              292999
Average Read Count        :               216/s
Average Write Count       :               216/s
Last Statistic Time       : 2023-04-24 17:34:30
Current Statistic Time    : 2023-04-24 17:35:30

Flink or Spark Version

no

Java or Scala Version

1.8

Screenshots

No response

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

samgang avatar Apr 24 '23 09:04 samgang

jdbc url add the option rewriteBatchStatements=true

TyrantLucifer avatar Apr 24 '23 15:04 TyrantLucifer

jdbc url add the option rewriteBatchStatements=true

Thanks, but rewriteBatchedStatements=true is ok for me.

samgang avatar Apr 25 '23 01:04 samgang

try set execution.parallelism=10

EricJoy2048 avatar Apr 26 '23 07:04 EricJoy2048

try set execution.parallelism=10

Thanks, It seems that the improvement is not significant.

  1. add rewriteBatchedStatements=true and set execution.parallelism=1
Read Count So Far         :             5502980
Write Count So Far        :             5500931
Average Read Count        :             11292/s
Average Write Count       :             11292/s
Total Time(s)             :                 532
Total Read Count          :             6005032
Total Write Count         :             6005032
  1. add rewriteBatchedStatements=true and set execution.parallelism=10
Read Count So Far         :             5657869
Write Count So Far        :             5655820
Average Read Count        :             11723/s
Average Write Count       :             11723/s
Total Time(s)             :                 516
Total Read Count          :             6005032
Total Write Count         :             6005032

samgang avatar Apr 26 '23 07:04 samgang

I had the same problem, the version is the same as his, but I added rewriteBatchStatements=true in the url, which did not solve it

`env { job.mode = "BATCH", execution.parallelism = 10 } source { Jdbc { url = "jdbc:mysql://192.168.0.4:3307/sql?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "root" query = "select * from test1" } }

sink { Jdbc { url = "jdbc:mysql://192.168.0.4:3307/sql?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "root" query = "insert into test1_copy2(brand,stt,edt) values(?,?,?)" } }`

` Job Progress Information


Job Id : 711925978242220033 Read Count So Far : 6048 Write Count So Far : 3999 Average Read Count : 100/s Average Write Count : 66/s Last Statistic Time : 2023-05-19 20:59:42 Current Statistic Time : 2023-05-19 21:00:42


`

shannxisj avatar May 19 '23 13:05 shannxisj

This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Jul 13 '23 00:07 github-actions[bot]

This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.

github-actions[bot] avatar Jul 23 '23 00:07 github-actions[bot]

I tried all the methods metioned above, and it was still very slow. mysql to postgre. My setting looks like: `env { execution.parallelism = 10 job.mode = "BATCH" }

source{ Jdbc { url = "jdbc:mysql://host:port/test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" connection_check_timeout_sec = 3600 user = "root" password = "xxxxx" query = "select * from test" fetch_size = 10000 } }

sink { Jdbc { url = "jdbc:postgresql://host:port/test?rewriteBatchStatements=true" driver = "org.postgresql.Driver" connection_check_timeout_sec = 3600 user = "root" password = "xxxx" database = "test" table = "ods.test" generate_sink_sql = true primary_keys = ["id"] compatible_mode = "postgresLow" batch_size = 10000 } }`

Yanhuanjin avatar Dec 15 '23 08:12 Yanhuanjin

MySQL-MySql very slow I tried the following configuration and encountered the same problem, and couldn't solve it. My setting looks like: env { #execution.parallelism = 2 job.mode = "BATCH" #checkpoint.interval = 100000 }

source { Jdbc { url = "jdbc:mysql://172.24.4.210:3306/performance_test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" connection_check_timeout_sec = 100 user = "root" password = "root" query = "select * from performance_1qw_copy limit 200000" parallelism = 10 } }

sink { Jdbc { url = "jdbc:mysql://172.24.4.159:3306/test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "L2s02K2mFi@#" generate_sink_sql = true database = test table = performance_1qw_copy } }

2024-01-31 14:07:25,506 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -


       Job Progress Information

Job Id : 804955400704098305 Read Count So Far : 21048 Write Count So Far : 18999 Average Read Count : 152/s Average Write Count : 152/s Last Statistic Time : 2024-01-31 14:06:25 Current Statistic Time : 2024-01-31 14:07:25


Aiden-Rose avatar Jan 31 '24 06:01 Aiden-Rose