dinky
[Bug] [Oracle whole-database sync] Flink reports an error during oracle->starrocks whole-database sync
Search before asking
- [X] I had searched in the issues and found no similar issues.
What happened
dinky:0.7.1 starrocks:2.4.3 flink:1.5.3 cdc:2.3
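Scenario:
- Single-table sync from oracle to starrocks has been debugged and works.
- During whole-database sync, Flink reports an error, while the dinky log shows no error.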
Sync SQL:
EXECUTE CDCSOURCE jobname2 WITH (
  'connector' = 'oracle-cdc',
  'hostname' = '',
  'port' = '1521',
  'username' = 'flink',
  'password'='',
  'checkpoint' = '12000',
  'scan.startup.mode' = 'initial',
  'parallelism' = '1',
  'database-name' = 'ZZMESDB',
  --'schema-name' = 'TEST',
  'table-name' = 'TEST.FLINK_TEST03',
  'debezium.log.mining.strategy' = 'online_catalog',
  'sink.connector' = 'starrocks',
  'sink.jdbc-url' = 'jdbc:mysql://*****:2030',
  'sink.load-url' = ':1030',
  'sink.username' = 'root',
  'sink.password' = '*',
  'sink.sink.db' = 'flink_test',
  'sink.table.lower' = 'true',
  'sink.database-name' = 'flink_test',
  'sink.table-name' = '${tableName}',
  'sink.sink.properties.format' = 'json',
  'sink.sink.properties.strip_outer_array' = 'true',
  'sink.sink.max-retries' = '10',
  'sink.sink.buffer-flush.interval-ms' = '15000',
  'sink.sink.parallelism' = '1'
)
What you expected to happen
Flink error:
2023-02-15 08:41:40
com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
    at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42)
    at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:325)
    at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:71)
    at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:160)
    at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:122)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.errors.SchemaBuilderException: Invalid default value
    at com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.data.SchemaBuilder.defaultValue(SchemaBuilder.java:131)
    at io.debezium.relational.TableSchemaBuilder.addField(TableSchemaBuilder.java:374)
    at io.debezium.relational.TableSchemaBuilder.lambda$create$2(TableSchemaBuilder.java:119)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
    at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
    at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
    at io.debezium.relational.TableSchemaBuilder.create(TableSchemaBuilder.java:117)
    at io.debezium.relational.RelationalDatabaseSchema.buildAndRegisterSchema(RelationalDatabaseSchema.java:130)
    at io.debezium.connector.oracle.OracleDatabaseSchema.lambda$applySchemaChange$0(OracleDatabaseSchema.java:73)
    at java.lang.Iterable.forEach(Iterable.java:75)
    at io.debezium.connector.oracle.OracleDatabaseSchema.applySchemaChange(OracleDatabaseSchema.java:72)
    at io.debezium.pipeline.EventDispatcher$SchemaChangeEventReceiver.schemaChangeEvent(EventDispatcher.java:522)
    at io.debezium.connector.oracle.OracleSchemaChangeEventEmitter.emitSchemaChangeEvent(OracleSchemaChangeEventEmitter.java:113)
    at io.debezium.pipeline.EventDispatcher.dispatchSchemaChangeEvent(EventDispatcher.java:297)
    at io.debezium.connector.oracle.logminer.LogMinerQueryResultProcessor.dispatchSchemaChangeEventAndGetTableForNewCapturedTable(LogMinerQueryResultProcessor.java:336)
    at io.debezium.connector.oracle.logminer.LogMinerQueryResultProcessor.getTableForDmlEvent(LogMinerQueryResultProcessor.java:323)
    at io.debezium.connector.oracle.logminer.LogMinerQueryResultProcessor.processResult(LogMinerQueryResultProcessor.java:257)
    at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:280)
    ... 8 more
Caused by: com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.errors.DataException: Invalid Java object for schema type INT8: class java.lang.String for field: "null"
    at com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:245)
    at com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.data.ConnectSchema.validateValue(ConnectSchema.java:213)
    at com.ververica.cdc.connectors.shaded.org.apache.kafka.connect.data.SchemaBuilder.defaultValue(SchemaBuilder.java:129)
    ... 31 more
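The innermost `Caused by` is the useful part of this trace: a `String` was supplied as the default value for a field whose schema type is `INT8`, and the schema builder rejected it. As a rough aid for reading traces like this one, the sketch below pulls out the exception headers in order. The helper name `cause_chain` is hypothetical, and the sample keeps only one frame per exception; it assumes frames start with " at " and nested causes with "Caused by: ", as in the trace above.

```python
def cause_chain(trace: str) -> list:
    """Return the exception headers of a Java stack trace, outermost first.

    Assumes each frame is introduced by " at " and each nested cause by
    "Caused by: " (true for the trace in this report).
    """
    chain = []
    for part in trace.split("Caused by: "):
        # Everything before the first frame is the exception class and message.
        header = part.split(" at ", 1)[0].strip()
        if header:
            chain.append(header)
    return chain


# Shortened copy of the reported trace: one frame kept per exception.
PREFIX = "com.ververica.cdc.connectors.shaded.org.apache.kafka.connect"
sample = (
    f"{PREFIX}.errors.ConnectException: An exception occurred in the change "
    "event producer. This connector will be stopped. "
    "at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42) "
    f"Caused by: {PREFIX}.errors.SchemaBuilderException: Invalid default value "
    f"at {PREFIX}.data.SchemaBuilder.defaultValue(SchemaBuilder.java:131) "
    f"Caused by: {PREFIX}.errors.DataException: Invalid Java object for schema "
    'type INT8: class java.lang.String for field: "null" '
    f"at {PREFIX}.data.ConnectSchema.validateValue(ConnectSchema.java:245)"
)

chain = cause_chain(sample)
root_cause = chain[-1]  # the innermost exception is listed last
print(root_cause)
```

The last entry of `chain` is the `DataException`, which points at a column default value rather than at the row data itself.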
How to reproduce
dinky log:
[dlink] 2023-02-15 08:38:44 CST INFO com.dlink.executor.Executor 274 loginFromKeytabIfNeed - Simple authentication mode
[dlink] 2023-02-15 08:38:44 CST INFO com.dlink.executor.Executor 274 loginFromKeytabIfNeed - Simple authentication mode
[dlink] 2023-02-15 08:38:44 CST INFO com.dlink.executor.Executor 274 loginFromKeytabIfNeed - Simple authentication mode
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 78 build - Start build CDCSOURCE Task...
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 165 build - A total of 0 tables were detected...
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 174 build - Set parallelism: 1
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 178 build - Set checkpoint: 12000
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 181 build - Build oracle-cdc successful...
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.cdc.sql.SQLSinkBuilder 220 build - Build deserialize successful...
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.cdc.sql.SQLSinkBuilder 277 build - A total of 0 table cdc sync were build successfull...
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 190 build - Build CDCSOURCE Task successful!
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: jobmanager.rpc.address, localhost
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: jobmanager.rpc.port, 6123
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: jobmanager.bind-host, 0.0.0.0
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: jobmanager.memory.process.size, 1600m
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: taskmanager.bind-host, localhost
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: taskmanager.host, localhost
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: taskmanager.memory.process.size, 1728m
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: taskmanager.numberOfTaskSlots, 1
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: parallelism.default, 1
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: high-availability, zookeeper
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: high-availability.storageDir, hdfs:///flink/ha/
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: high-availability.zookeeper.quorum, tbddn1:2181,tbddn2:2181,tbddn3:2181
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: state.checkpoints.dir, hdfs:///flink/checkpoints
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: state.savepoints.dir, hdfs:///flink/savepoints
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: jobmanager.execution.failover-strategy, region
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: rest.port, 8085
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: rest.address, localhost
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: rest.bind-address, 0.0.0.0
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.configuration.GlobalConfiguration 213 loadYAMLResource - Loading configuration property: classloader.check-leaked-classloader, false
[dlink] 2023-02-15 08:39:15 CST WARN org.apache.flink.yarn.configuration.YarnLogConfigUtil 73 discoverLogConfigFile - The configuration directory ('/usr/local/flink/conf') already contains a LOG4J config file.If you want to use logback, then please delete or rename the log configuration file.
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.hadoop.yarn.client.RMProxy 133 newProxyInstance - Connecting to ResourceManager at TBDCM1/10.10.20.86:8032
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 208 getLocalFlinkDistPath - No path for the flink jar passed. Using the location of class org.apache.flink.yarn.YarnClusterDescriptor to locate the jar
[dlink] 2023-02-15 08:39:15 CST WARN org.apache.flink.yarn.YarnClusterDescriptor 481 deployJobCluster - Job Clusters are deprecated since Flink 1.15. Please use an Application Cluster/Application Mode instead.
[dlink] 2023-02-15 08:39:15 CST WARN org.apache.flink.yarn.YarnClusterDescriptor 351 isReadyForDeployment - Neither the HADOOP_CONF_DIR nor the YARN_CONF_DIR environment variable is set. The Flink YARN Client needs one of these to be set to properly load the Hadoop configuration for accessing YARN.
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 728 logIfComponentMemNotIntegerMultipleOfYarnMinAllocation - The configured JobManager memory is 1600 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 448 MB may not be used by Flink.
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 728 logIfComponentMemNotIntegerMultipleOfYarnMinAllocation - The configured TaskManager memory is 1728 MB. YARN will allocate 2048 MB to make up an integer multiple of its minimum allocation memory (1024 MB, configured via 'yarn.scheduler.minimum-allocation-mb'). The extra 320 MB may not be used by Flink.
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 605 deployInternal - Cluster specification: ClusterSpecification{masterMemoryMB=1600, taskManagerMemoryMB=1728, slotsPerTaskManager=1}
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1300 lambda$removeLocalhostBindHostSetting$9 - Removing 'localhost' Key: 'taskmanager.bind-host' , default: null (fallback keys: []) setting from effective configuration; using '0.0.0.0' instead.
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils 330 capToMinMax - The derived from fraction jvm overhead memory (160.000mb (167772162 bytes)) is less than its min value 192.000mb (201326592 bytes), min value will be used instead
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1239 startAppMaster - Submitting application master application_1676339038783_0005
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl 348 submitApplication - Submitted application application_1676339038783_0005
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1242 startAppMaster - Waiting for the cluster to be allocated
[dlink] 2023-02-15 08:39:15 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1277 startAppMaster - Deploying cluster, current state ACCEPTED
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1270 startAppMaster - YARN application has been deployed successfully.
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1866 logDetachedClusterInformation - The Flink YARN session cluster has been started in detached mode. In order to stop Flink gracefully, use the following command:
$ echo "stop" | ./bin/yarn-session.sh -id application_1676339038783_0005
If this should not be possible, then you can also kill Flink via YARN's web interface or via:
$ yarn application -kill application_1676339038783_0005
Note that killing Flink might not clean up all job artifacts and temporary files.
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.yarn.YarnClusterDescriptor 1843 setClusterEntrypointInfoToConfig - Found Web Interface tbdcm1:8085 of application 'application_1676339038783_0005'.
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.runtime.util.ZooKeeperUtils 251 startCuratorFramework - Enforcing default ACL for ZK connections
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.runtime.util.ZooKeeperUtils 257 startCuratorFramework - Using '/flink/application_1676339038783_0005' as Zookeeper namespace.
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.shaded.curator5.org.apache.curator.framework.imps.CuratorFrameworkImpl 338 start - Starting
[dlink] 2023-02-15 08:39:20 CST INFO org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.ZooKeeper 868
Anything else
No response
Version
0.7.0
Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
There may be a problem with the type conversion of a special column type.
No, I tried that. My table has a single column of type VARCHAR2, and it still reports this error.
'table-name' = 'TEST\.FLINK_TEST03',
'table-name' = 'TEST.FLINK_TEST03',
When I paste it here, it gets converted (the backslash is dropped):
'table-name' = 'TEST.FLINK_TEST03',
See, as soon as I post the reply it gets converted.
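For readers comparing the two variants: if `table-name` is interpreted as a regular expression, which the escaped dot suggests but this thread does not confirm, the difference between them can be sketched as below. Python's `re` is used only to illustrate the regex semantics, and the second candidate name is made up for the example.

```python
import re

# "SCHEMA.TABLE" style names; only the first one appears in this issue,
# the second is a hypothetical name used to show the difference.
candidates = ["TEST.FLINK_TEST03", "TESTXFLINK_TEST03"]

escaped = r"TEST\.FLINK_TEST03"   # escaped dot: matches a literal "." only
unescaped = r"TEST.FLINK_TEST03"  # bare dot: matches any single character


def matching(pattern, names):
    """Return the names that the pattern matches in full."""
    return [name for name in names if re.fullmatch(pattern, name)]


escaped_hits = matching(escaped, candidates)
unescaped_hits = matching(unescaped, candidates)
print(escaped_hits)
print(unescaped_hits)
```

Both patterns match `TEST.FLINK_TEST03`; the unescaped one is only looser. Under this assumption, the missing backslash by itself would not explain a job that detects no tables.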
[dlink] 2023-02-15 08:39:15 CST INFO com.dlink.trans.ddl.CreateCDCSourceOperation 165 build - A total of 0 tables were detected...
No tables were fetched from the Oracle metadata. You can set a breakpoint in IDEA and debug to see why no tables are found.
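Before attaching a debugger, the same symptom can be spotted by scanning the dinky log for the detected-table count. This is a minimal sketch: the helper name `detected_table_count` is hypothetical, and the pattern is based only on the message wording quoted in this issue.

```python
import re

# Message wording taken from the dinky log line quoted in this issue.
DETECTED = re.compile(r"A total of (\d+) tables were detected")


def detected_table_count(log_text):
    """Return the count from the last matching log line, or None if absent."""
    counts = [int(m.group(1)) for m in DETECTED.finditer(log_text)]
    return counts[-1] if counts else None


line = (
    "[dlink] 2023-02-15 08:39:15 CST INFO "
    "com.dlink.trans.ddl.CreateCDCSourceOperation 165 build - "
    "A total of 0 tables were detected..."
)

count = detected_table_count(line)
if count == 0:
    print("CDCSOURCE matched no tables; check database-name, schema-name and table-name")
```

A count of 0 means the job was built without any source table, so the source-side options are the first thing to check.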
- [X] #1673