seatunnel
seatunnel copied to clipboard
[Bug] [connector-jdbc] mysql to mysql slow
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
The speed of data synchronization from MySQL to MySQL is very slow. So how should we improve the synchronization speed?
Also, is there a preSql function? such as deleting target table data before synchronization
SeaTunnel Version
apache-seatunnel-incubating-2.3.1
SeaTunnel Config
env {
execution.parallelism = 1
job.mode = "BATCH"
}
source {
Jdbc {
url = "jdbc:mysql://localhost:3306/test"
driver = "com.mysql.cj.jdbc.Driver"
connection_check_timeout_sec = 100
user = "xxxxxx"
password = "xxxxxx"
query = "select id,name from t1"
partition_column = "id"
partition_num = 10
}
}
sink {
jdbc {
url = "jdbc:mysql://localhost:3306/test"
driver = "com.mysql.cj.jdbc.Driver"
user = "xxxxxx"
password = "xxxxxx"
query = "INSERT INTO t2(id,name) VALUES (?,?)"
}
}
Running Command
./bin/seatunnel.sh --config test2.conf -m local
Error Exception
Read Count So Far : 282048
Write Count So Far : 279999
Average Read Count : 233/s
Average Write Count : 233/s
Last Statistic Time : 2023-04-24 17:33:30
Current Statistic Time : 2023-04-24 17:34:30
Read Count So Far : 295048
Write Count So Far : 292999
Average Read Count : 216/s
Average Write Count : 216/s
Last Statistic Time : 2023-04-24 17:34:30
Current Statistic Time : 2023-04-24 17:35:30
Flink or Spark Version
no
Java or Scala Version
1.8
Screenshots
No response
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
jdbc url add the option rewriteBatchStatements=true
jdbc url add the option
rewriteBatchStatements=true
Thanks, but rewriteBatchedStatements=true
is ok for me.
try set execution.parallelism=10
try set
execution.parallelism=10
Thanks, It seems that the improvement is not significant.
- add
rewriteBatchedStatements=true
and setexecution.parallelism=1
Read Count So Far : 5502980
Write Count So Far : 5500931
Average Read Count : 11292/s
Average Write Count : 11292/s
Total Time(s) : 532
Total Read Count : 6005032
Total Write Count : 6005032
- add
rewriteBatchedStatements=true
and setexecution.parallelism=10
Read Count So Far : 5657869
Write Count So Far : 5655820
Average Read Count : 11723/s
Average Write Count : 11723/s
Total Time(s) : 516
Total Read Count : 6005032
Total Write Count : 6005032
I had the same problem, the version is the same as his, but I added rewriteBatchStatements=true in the url, which did not solve it
`env { job.mode = "BATCH", execution.parallelism = 10 } source { Jdbc { url = "jdbc:mysql://192.168.0.4:3307/sql?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "root" query = "select * from test1" } }
sink { Jdbc { url = "jdbc:mysql://192.168.0.4:3307/sql?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "root" query = "insert into test1_copy2(brand,stt,edt) values(?,?,?)" } }`
` Job Progress Information
Job Id : 711925978242220033 Read Count So Far : 6048 Write Count So Far : 3999 Average Read Count : 100/s Average Write Count : 66/s Last Statistic Time : 2023-05-19 20:59:42 Current Statistic Time : 2023-05-19 21:00:42
`
This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in next 7 days if no further activity occurs.
This issue has been closed because it has not received response for too long time. You could reopen it if you encountered similar problems in the future.
I tried all the methods metioned above, and it was still very slow. mysql to postgre. My setting looks like: `env { execution.parallelism = 10 job.mode = "BATCH" }
source{ Jdbc { url = "jdbc:mysql://host:port/test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" connection_check_timeout_sec = 3600 user = "root" password = "xxxxx" query = "select * from test" fetch_size = 10000 } }
sink { Jdbc { url = "jdbc:postgresql://host:port/test?rewriteBatchStatements=true" driver = "org.postgresql.Driver" connection_check_timeout_sec = 3600 user = "root" password = "xxxx" database = "test" table = "ods.test" generate_sink_sql = true primary_keys = ["id"] compatible_mode = "postgresLow" batch_size = 10000 } }`
MySQL-MySql very slow I tried the following configuration and encountered the same problem, and couldn't solve it. My setting looks like: env { #execution.parallelism = 2 job.mode = "BATCH" #checkpoint.interval = 100000 }
source { Jdbc { url = "jdbc:mysql://172.24.4.210:3306/performance_test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" connection_check_timeout_sec = 100 user = "root" password = "root" query = "select * from performance_1qw_copy limit 200000" parallelism = 10 } }
sink { Jdbc { url = "jdbc:mysql://172.24.4.159:3306/test?rewriteBatchStatements=true" driver = "com.mysql.cj.jdbc.Driver" user = "root" password = "L2s02K2mFi@#" generate_sink_sql = true database = test table = performance_1qw_copy } }
2024-01-31 14:07:25,506 INFO org.apache.seatunnel.engine.client.job.JobMetricsRunner -
Job Progress Information
Job Id : 804955400704098305 Read Count So Far : 21048 Write Count So Far : 18999 Average Read Count : 152/s Average Write Count : 152/s Last Statistic Time : 2024-01-31 14:06:25 Current Statistic Time : 2024-01-31 14:07:25