paimon
[Bug] Incorrectly including tables matching excludingTablePattern in combined mode cdc
Search before asking
- [X] I searched in the issues and found nothing similar.
Paimon version
master
Compute Engine
flink
Minimal reproduce step
- Create a database named cdc_test in MySQL, then create tables with a primary key named pk_1, pk_2, ... pk_100 and tables without a primary key named non_pk_1, non_pk_2, ... non_pk_100.
- Start a combined mode MySQL database CDC job with excludingTablePattern set to 'non_pk_.+'.
- While the CDC job is starting, concurrently create another table without a primary key named non_pk_101.
- The JobManager log will then show "com.ververica.cdc.connectors.mysql.source.utils.TableDiscoveryUtils [] - including ‘cdc_test.non_pk_101’ for further processing".
What doesn't meet your expectations?
non_pk_101 clearly matches excludingTablePattern and should be excluded. Because it is included and has no primary key, MySqlChunkSplitter reports an error: Caused by: org.apache.flink.table.api.ValidationException: Incremental snapshot for tables requires primary key, but table cdc_test.non_pk_101 doesn't have primary key.
The combinedModeTableList function builds the excluding pattern as ?!(^db\.tbl$)|(^...$), which appears to list the excluded tables by exact name. A table that matches excludingTablePattern but is created while the CDC job is starting is not in that list, so it is not excluded.
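The behavior can be reproduced with plain java.util.regex, without Paimon or Flink. The following is a minimal sketch, not the actual combinedModeTableList code: the class name, the helpers enumeratedTableList and excludedByPattern, and the shortened list of startup tables are all hypothetical, and the regex shape is assumed from the pattern quoted above. It shows that a lookahead built from exact table names lets non_pk_101 through, while matching the table name against excludingTablePattern itself would reject it.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical sketch; names and structure are assumptions, not Paimon's code.
class CombinedModeExcludeSketch {

    // Builds a table-list regex whose negative lookahead enumerates, by exact
    // name, the tables that matched the excluding pattern at job startup.
    static String enumeratedTableList(
            String db, String includingPattern, List<String> excludedAtStartup) {
        String including = String.format("(%s)\\.(%s)", db, includingPattern);
        if (excludedAtStartup.isEmpty()) {
            return including;
        }
        String excluding = excludedAtStartup.stream()
                .map(t -> String.format("(^%s\\.%s$)", db, t))
                .collect(Collectors.joining("|"));
        return String.format("(?!%s)(%s)", excluding, including);
    }

    // Checks a bare table name directly against the user's excluding pattern.
    static boolean excludedByPattern(String table, String excludingTablePattern) {
        return Pattern.matches(excludingTablePattern, table);
    }

    public static void main(String[] args) {
        // Abbreviated stand-in for non_pk_1 ... non_pk_100 found at startup.
        List<String> excludedAtStartup = Arrays.asList("non_pk_1", "non_pk_2");
        String tableList = enumeratedTableList("cdc_test", ".+", excludedAtStartup);

        // Known at startup, so the lookahead rejects it.
        System.out.println(Pattern.matches(tableList, "cdc_test.non_pk_1"));   // false
        // Created later, not enumerated, so it is wrongly included.
        System.out.println(Pattern.matches(tableList, "cdc_test.non_pk_101")); // true
        // Checking the excluding pattern itself would catch it.
        System.out.println(excludedByPattern("non_pk_101", "non_pk_.+"));      // true
    }
}
```

The sketch only illustrates the mismatch. How the excluding pattern should actually be passed to the CDC source is left to the fix.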
Anything else?
No response
Are you willing to submit a PR?
- [X] I'm willing to submit a PR!