flink-cdc icon indicating copy to clipboard operation
flink-cdc copied to clipboard

postgresql cdc 空指针异常与漏数据:at org.postgresql.jdbc.FieldMetadata.getSize(FieldMetadata.java:79)

Open 1032851561 opened this issue 2 years ago • 0 comments

Describe the bug(Please use English) cdc一个表时,内部代码获取列元数据时多了两条数据记录导致空指针,并且print时少了两行数据

Environment :

  • Flink version : 1.14.5
  • Flink CDC version: 2.3-snapshot
  • debezium-connector-postgres-1.6.4.Final.jar
  • Database and version: postgresql 12.4

To Reproduce 有一个表test,55个列,2万多行数据,无论是通过flink sql还是streaming api方式去读取数据并print都会报空指针异常:

org.apache.kafka.connect.errors.ConnectException: An exception occurred in the change event producer. This connector will be stopped.
	at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:42)
	at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:130)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266)
	at java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: io.debezium.DebeziumException: java.lang.NullPointerException
	at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:78)
	at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:113)
	... 6 more
Caused by: java.lang.NullPointerException
	at org.postgresql.jdbc.FieldMetadata.getSize(FieldMetadata.java:79)
	at org.postgresql.util.LruCache.put(LruCache.java:128)
	at org.postgresql.util.LruCache.putAll(LruCache.java:154)
	at org.postgresql.jdbc.PgResultSetMetaData.fetchFieldMetaData(PgResultSetMetaData.java:269)
	at org.postgresql.jdbc.PgResultSetMetaData.isAutoIncrement(PgResultSetMetaData.java:57)
	at org.postgresql.jdbc.PgResultSetMetaData.getColumnTypeName(PgResultSetMetaData.java:324)
	at io.debezium.connector.postgresql.connection.PostgresConnection.getColumnValue(PostgresConnection.java:590)
	at io.debezium.jdbc.JdbcConnection.rowToArray(JdbcConnection.java:1483)
	at io.debezium.relational.RelationalSnapshotChangeEventSource.createDataEventsForTable(RelationalSnapshotChangeEventSource.java:356)
	at io.debezium.relational.RelationalSnapshotChangeEventSource.createDataEvents(RelationalSnapshotChangeEventSource.java:307)
	at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:137)
	at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:69)

通过调试发现在方法org.postgresql.jdbc.PgResultSetMetaData#fetchFieldMetaData中获取到列元数据为57条记录,比test表(55列)多了两条记录,而这两条记录是来自该表的两行数据(刚好行记录有部分列值为null)。其中一条示例(tableOid=1033630是表记录的主键值): image

当把这两行记录null列赋值后,不再出现空指针异常(获取列元数据依然是57条),但是cdc打印的结果永远都少了上述两行记录,导致数据不能一致。

Additional Description 当我在io.debezium.connector.postgresql.connection.PostgresConnection#getColumnValue中作以下修改

line 588 : final String columnTypeName = metaData.getColumnTypeName(columnIndex);
修改为:final String columnTypeName =( (PgResultSetMetaData) metaData).getPGType(columnIndex-1);

现象好像变得正常了,但是我不确定问题是否可以彻底解决,我怀疑这是由于并发引起PgStatement中ResultSet的数据混乱问题。

1032851561 avatar Aug 15 '22 08:08 1032851561