DataX
After the job starts normally, the number of records written never changes, and no error is reported
After the job starts normally, both the read and write speeds are 0. No error is reported, and the job just keeps running without ever stopping.
2020-02-28 10:50:01.195 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:50:11.196 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:50:21.197 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:50:41.198 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:50:51.198 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:51:01.199 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:51:21.200 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:51:41.201 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:51:51.201 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:52:01.202 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
2020-02-28 10:52:21.203 [job-0] INFO StandAloneJobContainerCommunicator - Total 3411 records, 3249695540 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 213.975s | All Task WaitReaderTime 0.918s | Percentage 0.00%
Since I can't take a screenshot, I can only paste the log. The job has been running for more than ten hours now, and the log has looked like this the whole time. The table has a bit over 80,000 rows. The config is as follows:

{
    "job": {
        "setting": {
            "speed": {
                "channel": 5
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "",
                        "password": "",
                        "splitPk": "task_id",
                        "column": [
                            "task_id", "task_type", "user_id", "status", "create_time",
                            "update_by", "update_time", "task_small_type", "pbi_id",
                            "method_name", "url", "start_time", "end_time", "download_url",
                            "count", "message", "delete_time", "param", "jetlag", "filename"
                        ],
                        "connection": [
                            {
                                "table": [""],
                                "jdbcUrl": ["jdbc:mysql://:3306/?autoReconnect=true&allowMultiQueries=true&characterEncoding=utf8"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "oraclewriter",
                    "parameter": {
                        "username": "*****",
                        "password": "",
                        "column": [
                            "TASK_ID", "TASK_TYPE", "USER_ID", "STATUS", "CREATE_TIME",
                            "UPDATE_BY", "UPDATE_TIME", "TASK_SMALL_TYPE", "PBI_ID",
                            "METHOD_NAME", "URL", "START_TIME", "END_TIME", "DOWNLOAD_URL",
                            "COUNT", "MESSAGE", "DELETE_TIME", "PARAM", "JETLAG", "FILENAME"
                        ],
                        "batchSize": 1024,
                        "preSql": ["truncate table "],
                        "connection": [
                            {
                                "jdbcUrl": "jdbc:oracle:thin:@***/***",
                                "table": ["*****"]
                            }
                        ]
                    }
                }
            }
        ]
    }
}
I have the same problem and am looking for a solution. I'm importing from TiDB to TiDB, but the data volume is large: the job usually hits the zero-speed idle state at around 4 million records. I tried increasing the memory at first, but it didn't seem to help.
2020-02-28 18:47:24.267 [job-0] INFO StandAloneJobContainerCommunicator - Total 4639712 records, 966677003 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 2,167.453s | All Task WaitReaderTime 101.420s | Percentage 0.00%
2020-02-28 18:47:44.267 [job-0] INFO StandAloneJobContainerCommunicator - Total 4639712 records, 966677003 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 2,167.453s | All Task WaitReaderTime 101.420s | Percentage 0.00%
2020-02-28 18:47:44.268 [job-0] INFO VMInfo - [delta cpu info] => curDeltaCpu | averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00% | -1.00%

[delta memory info] =>
NAME                   | used_size | used_percent | max_used_size | max_percent
PS Eden Space          | 70.70MB   | 24.21%       | 294.05MB      | 99.68%
Code Cache             | 8.80MB    | 97.10%       | 8.90MB        | 98.78%
Compressed Class Space | 1.86MB    | 92.85%       | 1.86MB        | 92.85%
PS Survivor Space      | 3.65MB    | 15.89%       | 19.91MB       | 97.12%
PS Old Gen             | 15.14MB   | 2.22%        | 15.14MB       | 2.22%
If the source is MySQL, the send buffer may be too small and overflowing. Try this: in com.alibaba.datax.plugin.rdbms.reader.util.ReaderSplitUtil, line 71, increase the factor of 5, and reduce the channel count. If your DBA is willing to cooperate, you can also ask them to enlarge the MySQL send buffer.
Was this ever solved? With a large data volume the job seems to be running, but it writes no logs and never finishes.
"After the job starts normally, both read and write speeds are 0, no error is reported, and the job keeps running without stopping." How did you solve this in the end?
From the log, All Task WaitWriterTime 2,167.453s indicates that the TiDB write side has a performance problem.
I hit a similar problem: mysql-to-hdfs, fairly large data volume; after running for a while the Speed drops to 0B/s.
A jstack dump shows a reader thread stuck closing its connection.
"0-0-2-reader" #25 prio=5 os_prio=0 tid=0x00007f73ac012000 nid=0x59 runnable [0x00007f73be59b000]
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:140)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:173)
- locked <0x00000005c0785458> (a com.mysql.jdbc.util.ReadAheadInputStream)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2911)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3387)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3327)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3814)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1928)
at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:378)
at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:358)
at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:158)
- locked <0x00000005c07854a8> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:6702)
- locked <0x00000005c07854a8> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:842)
at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2354)
- locked <0x00000005c07854a8> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1536)
at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4262)
at com.mysql.jdbc.ConnectionImpl.close(ConnectionImpl.java:1510)
- locked <0x00000005c07854a8> (a com.mysql.jdbc.JDBC4Connection)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.closeDBResources(DBUtil.java:498)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.closeDBResources(DBUtil.java:505)
at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:225)
at com.alibaba.datax.plugin.reader.mysqlreader.MysqlReader$Task.startRead(MysqlReader.java:81)
at com.alibaba.datax.core.taskgroup.runner.ReaderRunner.run(ReaderRunner.java:57)
at java.lang.Thread.run(Thread.java:748)
Has this been solved? @wangzili064
Not solved.
This is so hard...
Hi, how did you solve this problem?
2021-02-19 17:09:38 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:09:38.453 [job-0] INFO StandAloneJobContainerCommunicator - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:09:48 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:09:48.453 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 921B/s, 51 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:10:08 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:10:08.453 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:10:28 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:10:28.453 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:10:48 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:10:48.453 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:10:58 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:10:58.454 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:11:08 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:11:08.456 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2021-02-19 17:11:18 [AnalysisStatistics.analysisStatisticsLog-53] 2021-02-19 17:11:18.456 [job-0] INFO StandAloneJobContainerCommunicator - Total 512 records, 9216 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.001s | All Task WaitReaderTime 0.000s | Percentage 0.00%
It gets stuck even with only a few thousand rows...
DataX can roll back frequently when there is too much dirty data, which triggers JVM GC and drives the speed down to 0. You can work around it by keeping the dirty rows out of the SQL in the first place. This is usually caused by failed column type conversions. https://blog.csdn.net/weixin_40366684/article/details/110391762
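To illustrate "keeping the dirty rows out of the SQL itself": mysqlreader supports a `where` parameter, so rows that cannot be converted to the target column type can be filtered out at read time. This is only a sketch: the column name, predicate, table, and connection values below are made-up placeholders, not taken from anyone's actual job.

```json
"reader": {
    "name": "mysqlreader",
    "parameter": {
        "username": "...",
        "password": "...",
        "column": ["task_id", "create_time"],
        "where": "create_time IS NOT NULL AND create_time >= '1970-01-02'",
        "connection": [
            {
                "table": ["my_table"],
                "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/mydb"]
            }
        ]
    }
}
```

The predicate should exclude whatever values fail conversion on the writer side (NULLs, zero dates, and so on, depending on your schema).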
You can also try increasing the JVM metaspace:
-XX:MetaspaceSize=512m
-XX:MaxMetaspaceSize=512m
https://blog.csdn.net/u011381576/article/details/79635867?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.control&dist_request_id=0169bfdd-5853-48c8-afc2-372384dcb38f&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.control
Take a look at this issue: https://github.com/alibaba/DataX/issues/1005
+1, this is killing me.
You can try adding the following parameter to the reader's jdbcUrl: jdbc:mysql://localhost:3306/db_name?sessionVariables=MAX_EXECUTION_TIME=68400000
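For context: MAX_EXECUTION_TIME is a MySQL session variable (milliseconds, MySQL 5.7+) that caps how long a SELECT may run, so raising it via sessionVariables keeps the server from cutting off DataX's long streaming read. As a sketch (host, database, and table are placeholders), the URL would sit in the reader's connection block like this:

```json
"connection": [
    {
        "table": ["my_table"],
        "jdbcUrl": [
            "jdbc:mysql://localhost:3306/db_name?sessionVariables=MAX_EXECUTION_TIME=68400000"
        ]
    }
]
```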
Has anyone solved this same problem?
No. My problem was probably that the server was under-resourced and overloaded; check your own load. In my case it occasionally succeeds.
I ran into the same problem. Has it been solved?
I never found the root cause, and it doesn't happen every run; my preliminary analysis is that it gets stuck on the database write. My workaround: if the job stays stuck, kill it and treat it as a timeout. See: https://blog.csdn.net/u010978399/article/details/117963887
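A minimal sketch of that "kill it after a deadline" workaround as a wrapper script. The DataX paths in the comment are assumptions, and `sleep` stands in for the real job so the demo is self-contained:

```shell
#!/bin/sh
# Kill the job if it is still running after the deadline; coreutils
# `timeout` exits with 128+9=137 when it has to deliver SIGKILL.
# In production, replace the `sleep 10` stand-in with something like:
#   python /opt/datax/bin/datax.py /opt/jobs/mysql2oracle.json
timeout --signal=KILL 1 sleep 10
status=$?
if [ "$status" -eq 137 ]; then
    echo "datax job killed after exceeding the deadline; treat as timeout and retry"
fi
```

A scheduler (cron, Airflow, etc.) can then key a retry off the kill status instead of waiting forever on a job that will never finish.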
Root cause: StarRocks enforces a query timeout. DataX reads with a streaming result set, so when the read takes longer than the database's configured query timeout, the server cuts the read off. DataX does not surface the error, so the job keeps running with Speed stuck at 0.
Workarounds:
- Temporarily raise the database's query_timeout so that the sync finishes within it:
set global query_timeout=3000;
- Or set query_timeout on the statement itself; see https://docs.starrocks.com/zh-cn/latest/reference/System_variable
SELECT /*+ SET_VAR(query_timeout = 1) */ name FROM people ORDER BY name;
Details:
- DataX pulls data through java.sql.Statement with fetchSize set to Integer.MIN_VALUE. For the MySQL driver that enables streaming mode: the client receives rows from the server incrementally and keeps the query open until the whole result set has been consumed.
The relevant source:
/**
 * Job initialization
 */
public void init() {
    this.originalConfig = super.getPluginJobConf();
    Integer userConfigedFetchSize = this.originalConfig.getInt(Constant.FETCH_SIZE);
    if (userConfigedFetchSize != null) {
        LOG.warn("fetchSize does not need to be configured for mysqlreader and will be ignored; remove the fetchSize entry to silence this warning.");
    }
    // fetchSize is forced to Integer.MIN_VALUE
    this.originalConfig.set(Constant.FETCH_SIZE, Integer.MIN_VALUE);
    this.commonRdbmsReaderJob = new CommonRdbmsReader.Job(DATABASE_TYPE);
    this.commonRdbmsReaderJob.init(this.originalConfig);
}
/**
 * Task invocation
 */
public void startRead(RecordSender recordSender) {
    int fetchSize = this.readerSliceConfig.getInt(Constant.FETCH_SIZE);
    this.commonRdbmsReaderTask.startRead(this.readerSliceConfig, recordSender,
            super.getTaskPluginCollector(), fetchSize);
}
/**
 * A wrapped method to execute a select-like sql statement.
 *
 * @param conn         Database connection.
 * @param sql          sql statement to be executed
 * @param fetchSize    fetch size passed to the driver
 * @param queryTimeout unit: second
 * @return the result set of the query
 * @throws SQLException
 */
public static ResultSet query(Connection conn, String sql, int fetchSize, int queryTimeout)
        throws SQLException {
    // make sure autocommit is off
    conn.setAutoCommit(false);
    // ResultSet.TYPE_FORWARD_ONLY: the cursor can only move forward;
    // ResultSet.CONCUR_READ_ONLY: the ResultSet cannot be updated
    Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY);
    // fetchSize is Integer.MIN_VALUE here, i.e. streaming mode
    stmt.setFetchSize(fetchSize);
    stmt.setQueryTimeout(queryTimeout);
    return query(stmt, sql);
}
- The database has a query timeout configured. In StarRocks the setting is called query_timeout and defaults to 300s. If a query runs longer than this value, the database returns a query-timeout error.
show variables like '%timeout%';
# Result:
interactive_timeout 3600
net_read_timeout 60
net_write_timeout 60
new_planner_optimize_timeout 3000
query_delivery_timeout 300
query_timeout 300
tx_visible_wait_timeout 10
wait_timeout 28800
- DataX does not propagate this exception, so the program does not abort even though the database query has actually ended; hence Speed stays at 0.
2022-08-26 13:58:27.724 [job-0] INFO StandAloneJobContainerCommunicator - Total 778208 records, 497121061 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 291.284s | Percentage 0.00%
2022-08-26 13:58:37.731 [job-0] INFO StandAloneJobContainerCommunicator - Total 778208 records, 497121061 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 291.284s | Percentage 0.00%
- Debugging the code shows that when query_timeout is exceeded, the following error is thrown:
DataX's analysis suggests the most likely cause of the task failure:
com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-07], Description:[Failed to read data from the database. Check your column/table/where/querySql configuration or ask your DBA for help.]. - SQL executed: select id,coordinate,latitude,domain_name,uri,url,user_name,city_name,city,district,province,postcode,country,first_name,first_romanized_name,last_name,name_female,ssn,phone_number,email,date,year_data,month_data,day_of_week,pystr,random_element,random_letter,company,company_suffix,company_prefix,company_email,sentence,text,word from test_v7 - Error detail: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query exceeded time limit of 30 seconds
at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
at com.alibaba.datax.plugin.rdbms.util.RdbmsException.asQueryException(RdbmsException.java:81)
at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:220)
at com.alibaba.datax.plugin.reader.mysqlreader.MysqlReader$Task.startRead(MysqlReader.java:81)
at com.alibaba.datax.core.taskgroup.runner.ReaderRunner.run(ReaderRunner.java:57)
at java.lang.Thread.run(Thread.java:748)
- Test case:
Test table info:
- Columns: 34
- Rows: 12,965,900
- Size: 9.074 GB
Results with different query_timeout values:

query_timeout | records exported before the stall
---|---
30s | 77792
300s | 778208
3000s | 7821522
The job JSON:
{
    "core": {
        "transport": {
            "channel": {
                "speed": {
                    "byte": 5242880
                }
            }
        }
    },
    "job": {
        "setting": {
            "speed": {
                "channel": 1,
                "batchSize": 2048
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "root",
                        "column": [
                            "id", "coordinate", "latitude", "domain_name", "uri", "url", "user_name", "city_name", "city",
                            "district", "province", "postcode", "country", "first_name", "first_romanized_name", "last_name",
                            "name_female", "ssn", "phone_number", "email", "date", "year_data", "month_data", "day_of_week", "pystr",
                            "random_element", "random_letter", "company", "company_suffix", "company_prefix", "company_email", "sentence",
                            "text", "word"
                        ],
                        "splitPk": "id",
                        "connection": [
                            {
                                "table": ["test_v7"],
                                "jdbcUrl": ["jdbc:mysql://192.168.20.213:9030/TEST?connectTimeout=60000000&socketTimeout=60000000"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "txtfilewriter",
                    "parameter": {
                        "path": "E:\\opt",
                        "fileName": "text_datax_export",
                        "writeMode": "truncate",
                        "dateFormat": "yyyy-MM-dd"
                    }
                }
            }
        ]
    }
}
The job that sets the variable via the SQL hint:
{
    "core": {
        "transport": {
            "channel": {
                "speed": {
                    "byte": 5242880
                }
            }
        }
    },
    "job": {
        "setting": {
            "speed": {
                "channel": 1,
                "batchSize": 2048
            },
            "errorLimit": {
                "record": 0,
                "percentage": 0.02
            }
        },
        "content": [
            {
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        "username": "root",
                        "password": "root",
                        "connection": [
                            {
                                "querySql": [
                                    "select /*+ SET_VAR(query_timeout = 100) */ id,coordinate,latitude,domain_name,uri,url,user_name,city_name,city,district,province,postcode,country,first_name,first_romanized_name,last_name,name_female,ssn,phone_number,email,date,year_data,month_data,day_of_week,pystr,random_element,random_letter,company,company_suffix,company_prefix,company_email,sentence,text,word from test_v7"
                                ],
                                "jdbcUrl": ["jdbc:mysql://192.168.20.213:9030/TEST"]
                            }
                        ]
                    }
                },
                "writer": {
                    "name": "txtfilewriter",
                    "parameter": {
                        "path": "E:\\opt",
                        "fileName": "text_datax_export",
                        "writeMode": "truncate",
                        "dateFormat": "yyyy-MM-dd"
                    }
                }
            }
        ]
    }
}
Thank you very much!
Has anyone solved this?
2022-11-29 16:49:23.401 [job-0] INFO StandAloneJobContainerCommunicator - Total 0 records, 0 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.000s | All Task WaitReaderTime 0.000s | Percentage 0.00%
2022-11-29 16:49:33.406 [job-0] INFO StandAloneJobContainerCommunicator - Total 2560 records, 1028585 bytes | Speed 100.45KB/s, 256 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.002s | All Task WaitReaderTime 0.545s | Percentage 0.00%
2022-11-29 16:49:43.411 [job-0] INFO StandAloneJobContainerCommunicator - Total 4608 records, 1850706 bytes | Speed 80.29KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 12.173s | All Task WaitReaderTime 0.780s | Percentage 0.00%
2022-11-29 16:49:53.414 [job-0] INFO StandAloneJobContainerCommunicator - Total 6656 records, 2672281 bytes | Speed 80.23KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 22.329s | All Task WaitReaderTime 0.982s | Percentage 0.00%
2022-11-29 16:50:03.418 [job-0] INFO StandAloneJobContainerCommunicator - Total 8704 records, 3494546 bytes | Speed 80.30KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 32.195s | All Task WaitReaderTime 1.227s | Percentage 0.00%
2022-11-29 16:50:13.420 [job-0] INFO StandAloneJobContainerCommunicator - Total 10752 records, 4319693 bytes | Speed 80.58KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 40.863s | All Task WaitReaderTime 1.499s | Percentage 0.00%
2022-11-29 16:50:23.421 [job-0] INFO StandAloneJobContainerCommunicator - Total 13504 records, 5428467 bytes | Speed 108.28KB/s, 275 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 57.732s | All Task WaitReaderTime 1.766s | Percentage 0.00%
2022-11-29 16:50:33.426 [job-0] INFO StandAloneJobContainerCommunicator - Total 16896 records, 6795159 bytes | Speed 133.47KB/s, 339 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 65.173s | All Task WaitReaderTime 2.287s | Percentage 0.00%
2022-11-29 16:50:43.428 [job-0] INFO StandAloneJobContainerCommunicator - Total 18944 records, 7619968 bytes | Speed 80.55KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 73.436s | All Task WaitReaderTime 2.557s | Percentage 0.00%
2022-11-29 16:50:53.433 [job-0] INFO StandAloneJobContainerCommunicator - Total 20992 records, 8445006 bytes | Speed 80.57KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 81.569s | All Task WaitReaderTime 2.828s | Percentage 0.00%
2022-11-29 16:51:03.436 [job-0] INFO StandAloneJobContainerCommunicator - Total 23040 records, 9269992 bytes | Speed 80.56KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 89.167s | All Task WaitReaderTime 3.108s | Percentage 0.00%
2022-11-29 16:51:13.442 [job-0] INFO StandAloneJobContainerCommunicator - Total 27136 records, 10915551 bytes | Speed 160.70KB/s, 409 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 104.257s | All Task WaitReaderTime 3.667s | Percentage 0.00%
2022-11-29 16:51:23.444 [job-0] INFO StandAloneJobContainerCommunicator - Total 29184 records, 11736658 bytes | Speed 80.19KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 112.227s | All Task WaitReaderTime 3.939s | Percentage 0.00%
2022-11-29 16:51:33.445 [job-0] INFO StandAloneJobContainerCommunicator - Total 31232 records, 12558113 bytes | Speed 80.22KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 119.752s | All Task WaitReaderTime 4.204s | Percentage 0.00%
2022-11-29 16:51:43.454 [job-0] INFO StandAloneJobContainerCommunicator - Total 33280 records, 13381446 bytes | Speed 80.40KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 127.553s | All Task WaitReaderTime 4.496s | Percentage 0.00%
2022-11-29 16:51:53.456 [job-0] INFO StandAloneJobContainerCommunicator - Total 37376 records, 15028167 bytes | Speed 160.81KB/s, 409 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 142.874s | All Task WaitReaderTime 5.021s | Percentage 0.00%
2022-11-29 16:52:03.462 [job-0] INFO StandAloneJobContainerCommunicator - Total 39424 records, 15851508 bytes | Speed 80.40KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 150.918s | All Task WaitReaderTime 5.309s | Percentage 0.00%
2022-11-29 16:52:13.463 [job-0] INFO StandAloneJobContainerCommunicator - Total 41472 records, 16674817 bytes | Speed 80.40KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 159.122s | All Task WaitReaderTime 5.573s | Percentage 0.00%
2022-11-29 16:52:23.465 [job-0] INFO StandAloneJobContainerCommunicator - Total 43520 records, 17498144 bytes | Speed 80.40KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 166.976s | All Task WaitReaderTime 5.731s | Percentage 0.00%
2022-11-29 16:52:33.469 [job-0] INFO StandAloneJobContainerCommunicator - Total 46880 records, 18849063 bytes | Speed 131.92KB/s, 336 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 182.649s | All Task WaitReaderTime 6.172s | Percentage 0.00%
2022-11-29 16:52:43.475 [job-0] INFO StandAloneJobContainerCommunicator - Total 49664 records, 19968479 bytes | Speed 109.32KB/s, 278 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 190.334s | All Task WaitReaderTime 6.582s | Percentage 0.00%
2022-11-29 16:52:53.481 [job-0] INFO StandAloneJobContainerCommunicator - Total 51712 records, 20791919 bytes | Speed 80.41KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 198.180s | All Task WaitReaderTime 6.857s | Percentage 0.00%
2022-11-29 16:53:03.486 [job-0] INFO StandAloneJobContainerCommunicator - Total 53760 records, 21615435 bytes | Speed 80.42KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 205.826s | All Task WaitReaderTime 7.122s | Percentage 0.00%
2022-11-29 16:53:13.491 [job-0] INFO StandAloneJobContainerCommunicator - Total 57728 records, 23210716 bytes | Speed 155.79KB/s, 396 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 220.918s | All Task WaitReaderTime 7.641s | Percentage 0.00%
2022-11-29 16:53:23.495 [job-0] INFO StandAloneJobContainerCommunicator - Total 59904 records, 24085633 bytes | Speed 85.44KB/s, 217 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 229.898s | All Task WaitReaderTime 7.928s | Percentage 0.00%
2022-11-29 16:53:33.496 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 80.42KB/s, 204 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:53:43.498 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:53:53.504 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:54:03.505 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:54:13.511 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:54:13.517 [job-0] INFO VMInfo -
[delta cpu info] =>
curDeltaCpu | averageCpu | maxDeltaCpu | minDeltaCpu
-1.00% | -1.00% | -1.00% | -1.00%
[delta memory info] =>
NAME | used_size | used_percent | max_used_size | max_percent
CodeHeap 'profiled nmethods' | 4.76MB | 71.11% | 4.76MB | 71.11%
G1 Old Gen | 351.42MB | 55.78% | 351.42MB | 55.78%
G1 Survivor Space | 25.00MB | 100.00% | 25.00MB | 100.00%
CodeHeap 'non-profiled nmethods' | 1.84MB | 75.59% | 1.84MB | 75.59%
Compressed Class Space | 2.77MB | 92.42% | 2.77MB | 92.42%
Metaspace | 27.69MB | 96.33% | 27.69MB | 96.33%
G1 Eden Space | 8.00MB | 2.17% | 8.00MB | 2.17%
CodeHeap 'non-nmethods' | 1.17MB | 48.13% | 1.17MB | 48.13%
[delta gc info] =>
NAME | curDeltaGCCount | totalGCCount | maxDeltaGCCount | minDeltaGCCount | curDeltaGCTime | totalGCTime | maxDeltaGCTime | minDeltaGCTime
G1 Young Generation | 93 | 93 | 93 | 93 | 1.606s | 1.606s | 1.606s | 1.606s
G1 Old Generation | 0 | 0 | 0 | 0 | 0.000s | 0.000s | 0.000s | 0.000s
2022-11-29 16:54:23.520 [job-0] INFO StandAloneJobContainerCommunicator - Total 61952 records, 24909129 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 239.507s | All Task WaitReaderTime 8.118s | Percentage 0.00%
2022-11-29 16:54:29.737 [0-0-0-writer] WARN CommonRdbmsWriter$Task - 回滚此次写入, 采用每次写入一行方式提交. 因为:Communications link failure
Mine doesn't keep running forever; the database crashes outright.
For Oracle, how do I set query_timeout in the jdbcUrl?
Check whether the reader-side database has a query timeout configured; the query may have timed out.
Root cause: StarRocks enforces a query timeout, while DataX reads data with a streaming result set. The read ran past the database's query timeout, the server cut the query off, and DataX never reported the error, so the job kept running with Speed stuck at 0.
Workarounds:
- Temporarily raise the database's query_timeout so that it exceeds the expected sync duration:
set global query_timeout=3000;
- Or set query_timeout on the statement itself; see: https://docs.starrocks.com/zh-cn/latest/reference/System_variable
SELECT /*+ SET_VAR(query_timeout = 1) */ name FROM people ORDER BY name;
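If the hinted SQL is generated programmatically, the rewrite above is mechanical. A hypothetical Java helper (illustration only, not part of DataX) might look like:

```java
// Hypothetical helper, not part of DataX: rewrite a SELECT so it carries
// the StarRocks SET_VAR(query_timeout) optimizer hint shown above.
public class QueryTimeoutHint {
    public static String withQueryTimeout(String selectSql, int seconds) {
        String trimmed = selectSql.trim();
        if (!trimmed.toLowerCase().startsWith("select")) {
            throw new IllegalArgumentException("only SELECT statements are supported");
        }
        // insert the hint right after the SELECT keyword
        return "select /*+ SET_VAR(query_timeout = " + seconds + ") */ "
                + trimmed.substring("select".length()).trim();
    }
}
```

The rewritten statement can then be dropped into the reader's `querySql` field unchanged.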
Details:
- DataX pulls data through java.sql.Statement and forces fetchSize to Integer.MIN_VALUE, which puts the MySQL driver into streaming mode: rows are received from the server in small batches until the result set has been fully consumed.
Relevant source:
/**
 * Job initialization
 */
public void init() {
    this.originalConfig = super.getPluginJobConf();
    Integer userConfigedFetchSize = this.originalConfig.getInt(Constant.FETCH_SIZE);
    if (userConfigedFetchSize != null) {
        LOG.warn("对 mysqlreader 不需要配置 fetchSize, mysqlreader 将会忽略这项配置. 如果您不想再看到此警告,请去除fetchSize 配置.");
    }
    // forced to Integer.MIN_VALUE by default
    this.originalConfig.set(Constant.FETCH_SIZE, Integer.MIN_VALUE);

    this.commonRdbmsReaderJob = new CommonRdbmsReader.Job(DATABASE_TYPE);
    this.commonRdbmsReaderJob.init(this.originalConfig);
}

/**
 * Task invocation
 */
public void startRead(RecordSender recordSender) {
    int fetchSize = this.readerSliceConfig.getInt(Constant.FETCH_SIZE);
    this.commonRdbmsReaderTask.startRead(this.readerSliceConfig, recordSender,
            super.getTaskPluginCollector(), fetchSize);
}

/**
 * a wrapped method to execute select-like sql statement .
 *
 * @param conn Database connection .
 * @param sql sql statement to be executed
 * @param fetchSize
 * @param queryTimeout unit:second
 * @return
 * @throws SQLException
 */
public static ResultSet query(Connection conn, String sql, int fetchSize, int queryTimeout)
        throws SQLException {
    // make sure autocommit is off
    conn.setAutoCommit(false);

    // ResultSet.TYPE_FORWARD_ONLY: forward scrolling only;
    // ResultSet.CONCUR_READ_ONLY: the ResultSet cannot be updated
    Statement stmt = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY,
            ResultSet.CONCUR_READ_ONLY);
    // fetchSize is Integer.MIN_VALUE here
    stmt.setFetchSize(fetchSize);
    stmt.setQueryTimeout(queryTimeout);
    return query(stmt, sql);
}
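Because the server kills the streamed query mid-read and DataX's reader thread simply goes quiet, the stall is visible only as a flat Total counter in the periodic log line. A supervising script can at least detect that. A minimal, hypothetical stall detector (not part of DataX), assuming you feed it the Total records value parsed from each StandAloneJobContainerCommunicator report:

```java
// Hypothetical stall detector, not part of DataX: feed it the "Total N records"
// value from each periodic report line; it flags a stall once the counter has
// been flat for maxIdleTicks consecutive readings.
public class StallWatchdog {
    private final int maxIdleTicks;   // identical readings tolerated before flagging
    private long lastTotal = Long.MIN_VALUE;
    private int idleTicks = 0;

    public StallWatchdog(int maxIdleTicks) {
        this.maxIdleTicks = maxIdleTicks;
    }

    /** Returns true when the counter has not advanced for maxIdleTicks ticks. */
    public boolean tick(long totalRecords) {
        if (totalRecords == lastTotal) {
            idleTicks++;
        } else {
            idleTicks = 0;          // progress resumed, start counting again
            lastTotal = totalRecords;
        }
        return idleTicks >= maxIdleTicks;
    }
}
```

With DataX's 10-second report interval, `new StallWatchdog(30)` flags a job whose Total has been frozen for roughly five minutes; the supervisor can then kill the DataX process and retry.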
- The database has a query timeout configured. In StarRocks the variable is query_timeout, with a default of 300s; any query that runs longer than this value gets a query-timeout error from the database.
show variables like '%timeout%';
# Results:
interactive_timeout            3600
net_read_timeout               60
net_write_timeout              60
new_planner_optimize_timeout   3000
query_delivery_timeout         300
query_timeout                  300
tx_visible_wait_timeout        10
wait_timeout                   28800
- DataX never surfaces this exception, so the job does not abort; the query on the database side has actually finished, which is why Speed shows 0.
2022-08-26 13:58:27.724 [job-0] INFO StandAloneJobContainerCommunicator - Total 778208 records, 497121061 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 291.284s | Percentage 0.00%
2022-08-26 13:58:37.731 [job-0] INFO StandAloneJobContainerCommunicator - Total 778208 records, 497121061 bytes | Speed 0B/s, 0 records/s | Error 0 records, 0 bytes | All Task WaitWriterTime 0.046s | All Task WaitReaderTime 291.284s | Percentage 0.00%
- Stepping through the code shows that once query_timeout is exceeded, the following error is thrown:
经DataX智能分析,该任务最可能的错误原因是:
com.alibaba.datax.common.exception.DataXException: Code:[DBUtilErrorCode-07], Description:[读取数据库数据失败. 请检查您的配置的 column/table/where/querySql或者向 DBA 寻求帮助.].
- 执行的SQL为: select id,coordinate,latitude,domain_name,uri,url,user_name,city_name,city,district,province,postcode,country,first_name,first_romanized_name,last_name,name_female,ssn,phone_number,email,date,year_data,month_data,day_of_week,pystr,random_element,random_letter,company,company_suffix,company_prefix,company_email,sentence,text,word from test_v7
具体错误信息为:com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Query exceeded time limit of 30 seconds
    at com.alibaba.datax.common.exception.DataXException.asDataXException(DataXException.java:26)
    at com.alibaba.datax.plugin.rdbms.util.RdbmsException.asQueryException(RdbmsException.java:81)
    at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:220)
    at com.alibaba.datax.plugin.reader.mysqlreader.MysqlReader$Task.startRead(MysqlReader.java:81)
    at com.alibaba.datax.core.taskgroup.runner.ReaderRunner.run(ReaderRunner.java:57)
    at java.lang.Thread.run(Thread.java:748)
- Test case:
Test table:
- Columns: 34
- Row count: 12,965,900
- Table size: 9.074 GB
Results for different query_timeout values:
query_timeout value    rows successfully exported
30s                    77792
300s                   778208
3000s                  7821522
The job JSON:
{
  "core": {
    "transport": {
      "channel": {
        "speed": { "byte": 5242880 }
      }
    }
  },
  "job": {
    "setting": {
      "speed": { "channel": 1, "batchSize": 2048 },
      "errorLimit": { "record": 0, "percentage": 0.02 }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "root",
            "column": [
              "id","coordinate","latitude","domain_name","uri","url","user_name","city_name","city",
              "district","province","postcode","country","first_name","first_romanized_name","last_name",
              "name_female","ssn","phone_number","email","date","year_data","month_data","day_of_week","pystr",
              "random_element","random_letter","company","company_suffix","company_prefix","company_email","sentence",
              "text","word"
            ],
            "splitPk": "id",
            "connection": [
              {
                "table": ["test_v7"],
                "jdbcUrl": ["jdbc:mysql://192.168.20.213:9030/TEST?connectTimeout=60000000&socketTimeout=60000000"]
              }
            ]
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "E:\\opt",
            "fileName": "text_datax_export",
            "writeMode": "truncate",
            "dateFormat": "yyyy-MM-dd"
          }
        }
      }
    ]
  }
}
The job that sets the variable via a SQL hint:
{
  "core": {
    "transport": {
      "channel": {
        "speed": { "byte": 5242880 }
      }
    }
  },
  "job": {
    "setting": {
      "speed": { "channel": 1, "batchSize": 2048 },
      "errorLimit": { "record": 0, "percentage": 0.02 }
    },
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "root",
            "connection": [
              {
                "querySql": [
                  "select /*+ SET_VAR(query_timeout = 100) */ id,coordinate,latitude,domain_name,uri,url,user_name,city_name,city,district,province,postcode,country,first_name,first_romanized_name,last_name,name_female,ssn,phone_number,email,date,year_data,month_data,day_of_week,pystr,random_element,random_letter,company,company_suffix,company_prefix,company_email,sentence,text,word from test_v7"
                ],
                "jdbcUrl": ["jdbc:mysql://192.168.20.213:9030/TEST"]
              }
            ]
          }
        },
        "writer": {
          "name": "txtfilewriter",
          "parameter": {
            "path": "E:\\opt",
            "fileName": "text_datax_export",
            "writeMode": "truncate",
            "dateFormat": "yyyy-MM-dd"
          }
        }
      }
    ]
  }
}
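The first job above leans on the driver-side connectTimeout/socketTimeout URL parameters (MySQL Connector/J names). If job JSON is generated by tooling, a small hypothetical helper (not part of DataX) can ensure they are present, with values in milliseconds:

```java
// Hypothetical helper, not part of DataX: append MySQL Connector/J timeout
// parameters (milliseconds) to a jdbcUrl, leaving existing settings untouched.
public class JdbcUrlTimeouts {
    public static String withTimeouts(String jdbcUrl, long connectMs, long socketMs) {
        String url = jdbcUrl;
        if (!url.contains("connectTimeout=")) {
            url += (url.contains("?") ? "&" : "?") + "connectTimeout=" + connectMs;
        }
        if (!url.contains("socketTimeout=")) {
            url += (url.contains("?") ? "&" : "?") + "socketTimeout=" + socketMs;
        }
        return url;
    }
}
```

Note these parameters bound the client side only; the server-side query_timeout discussed above still applies independently.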
Thanks a lot.
My mysql2clickhouse job is stuck here, what's going on?
"0-7-309-reader" #331 prio=5 os_prio=0 tid=0x00007ff810013000 nid=0xfc7 runnable [0x00007ff7e77b5000]
java.lang.Thread.State: RUNNABLE
at java.io.FileOutputStream.writeBytes(Native Method)
at java.io.FileOutputStream.write(FileOutputStream.java:326)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:122)
- locked <0x00000000fb222748> (a java.io.BufferedOutputStream)
at java.io.PrintStream.write(PrintStream.java:480)
- locked <0x00000000fb222760> (a java.io.PrintStream)
at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)
at sun.nio.cs.StreamEncoder.implFlushBuffer(StreamEncoder.java:291)
at sun.nio.cs.StreamEncoder.flushBuffer(StreamEncoder.java:104)
- locked <0x00000000fb2227c8> (a java.io.OutputStreamWriter)
at java.io.OutputStreamWriter.flushBuffer(OutputStreamWriter.java:185)
at java.io.PrintStream.write(PrintStream.java:527)
- locked <0x00000000fb222760> (a java.io.PrintStream)
at java.io.PrintStream.print(PrintStream.java:669)
at java.io.PrintStream.println(PrintStream.java:806)
- locked <0x00000000fb222760> (a java.io.PrintStream)
at com.mysql.jdbc.log.StandardLogger.logInternal(StandardLogger.java:318)
at com.mysql.jdbc.log.StandardLogger.logWarn(StandardLogger.java:240)
at com.mysql.jdbc.MysqlIO.doHandshake(MysqlIO.java:1177)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2199)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2230)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2025)
- locked <0x00000000fb2229a0> (a com.mysql.jdbc.JDBC4Connection)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:778)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.GeneratedConstructorAccessor11.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:386)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:330)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:208)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:395)
- locked <0x00000000fac3c070> (a java.lang.Class for com.alibaba.datax.plugin.rdbms.util.DBUtil)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.connect(DBUtil.java:387)
- locked <0x00000000fac3c070> (a java.lang.Class for com.alibaba.datax.plugin.rdbms.util.DBUtil)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.access$000(DBUtil.java:22)
at com.alibaba.datax.plugin.rdbms.util.DBUtil$3.call(DBUtil.java:322)
at com.alibaba.datax.plugin.rdbms.util.DBUtil$3.call(DBUtil.java:319)
at com.alibaba.datax.common.util.RetryUtil$Retry.call(RetryUtil.java:164)
at com.alibaba.datax.common.util.RetryUtil$Retry.doRetry(RetryUtil.java:111)
at com.alibaba.datax.common.util.RetryUtil.executeWithRetry(RetryUtil.java:30)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.getConnection(DBUtil.java:319)
at com.alibaba.datax.plugin.rdbms.util.DBUtil.getConnection(DBUtil.java:303)
at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:185)
at com.alibaba.datax.plugin.reader.mysqlreader.MysqlReader$Task.startRead(MysqlReader.java:81)
at com.alibaba.datax.core.taskgroup.runner.ReaderRunner.run(ReaderRunner.java:57)
at java.lang.Thread.run(Thread.java:748)
Locked ownable synchronizers:
- None