[cdc] Optimize SyncDatabaseAction performance by removing listTables calls

Open huyuanfeng2018 opened this issue 5 months ago • 0 comments

Purpose

From #5955

What is the purpose of the change

Optimize SyncDatabaseAction performance by removing the expensive listTables calls made during initialization. This improves scalability for databases with many tables.

Brief change log

  • Remove listTables() call from RichCdcMultiplexRecordEventParser
  • Implement lazy table creation in CdcDynamicTableParsingProcessFunction#processElement
  • Remove createdTables Set to reduce memory usage
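
The lazy-creation idea in the change log can be illustrated with a small, self-contained sketch. It is not the code from this change: InMemoryCatalog and LazyProcessor are hypothetical stand-ins (they are not Paimon's Catalog or CdcDynamicTableParsingProcessFunction), and the sketch assumes the catalog can answer an existence check cheaply. It only shows the shape of the approach: nothing is listed at startup, no createdTables set is kept, and a table is created when its first record arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LazyTableCreationSketch {

    /** Hypothetical in-memory catalog; a stand-in, not Paimon's Catalog API. */
    static class InMemoryCatalog {
        private final Map<String, List<String>> tables = new HashMap<>();
        int listTablesCalls = 0;   // counts the expensive call the change avoids
        int createTableCalls = 0;

        List<String> listTables() {
            listTablesCalls++;
            return new ArrayList<>(tables.keySet());
        }

        boolean tableExists(String name) {
            return tables.containsKey(name);
        }

        void createTable(String name) {
            createTableCalls++;
            tables.putIfAbsent(name, new ArrayList<>());
        }

        List<String> rows(String name) {
            return tables.get(name);
        }
    }

    /** Hypothetical processor that creates a table only when its first record arrives. */
    static class LazyProcessor {
        private final InMemoryCatalog catalog;

        LazyProcessor(InMemoryCatalog catalog) {
            // No listTables() call here and no local createdTables set.
            this.catalog = catalog;
        }

        void processElement(String tableName, String row) {
            // Ask the catalog about this one table instead of consulting a preloaded list.
            if (!catalog.tableExists(tableName)) {
                catalog.createTable(tableName);
            }
            catalog.rows(tableName).add(row);
        }
    }

    public static void main(String[] args) {
        InMemoryCatalog catalog = new InMemoryCatalog();
        LazyProcessor processor = new LazyProcessor(catalog);
        processor.processElement("orders", "row-1");
        processor.processElement("orders", "row-2");
        processor.processElement("users", "row-3");
        System.out.println("createTable calls: " + catalog.createTableCalls
                + ", listTables calls: " + catalog.listTablesCalls);
    }
}
```

In this sketch the startup cost no longer depends on how many tables the database holds. The per-record existence check is the trade-off, and a real implementation may handle it differently.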

Verifying this change

  • Verified existing functionality remains intact

Testing

This optimization does not require additional test cases because existing tests already cover the affected functionality:

  • SyncDatabaseActionBaseTest.testSyncTablesWithoutDbLists() - validates table filtering logic
  • SyncDatabaseActionBaseTest.testSyncTablesWithDbList() - validates database filtering logic
  • SyncDatabaseActionBaseTest.testSycTablesCrossDB() - validates cross-database filtering scenarios

All of these tests create and use RichCdcMultiplexRecordEventParser, so they exercise the changed code path and would surface a regression in the existing behavior.

huyuanfeng2018 • Jul 24 '25 10:07