Support dynamic table addition in flink-cdc-base
Currently the flink-cdc-base framework doesn't support discovering and adding tables dynamically. This feature is already implemented in the MySQL CDC connector, so the framework needs to support it before the MySQL CDC connector can be adapted to flink-cdc-base.
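To make the request concrete, the discovery step can be thought of as a comparison between the tables already tracked in restored state and the tables that match the current configuration. The snippet below is only a minimal sketch under that assumption: `NewlyAddedTables` and `findNewlyAdded` are hypothetical names, not existing flink-cdc-base or MySQL CDC code, and table IDs are simplified to plain strings.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of flink-cdc-base. */
final class NewlyAddedTables {

    /**
     * Returns the discovered tables that are not yet tracked in restored
     * state, preserving discovery order and skipping repeated entries.
     */
    static List<String> findNewlyAdded(Set<String> alreadyCaptured, Collection<String> discovered) {
        List<String> added = new ArrayList<>();
        for (String tableId : discovered) {
            if (!alreadyCaptured.contains(tableId) && !added.contains(tableId)) {
                added.add(tableId);
            }
        }
        return added;
    }

    public static void main(String[] args) {
        // Tables recorded before the restart (assumed to come from checkpointed state).
        Set<String> captured = Set.of("app.users");
        // Tables matching the table list after the configuration change.
        List<String> discovered = List.of("app.users", "app.orders");
        System.out.println(findNewlyAdded(captured, discovered)); // prints [app.orders]
    }
}
```

In a real source, only the tables returned here would need new snapshot splits, while the already captured tables would keep reading from their stored offsets.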
Please assign this task to me if no one else has taken it. I am happy to take it on, @PatrickRen.
Hey @PatrickRen @leonardBang, do you know if anyone is actively working on this?
I believe this is needed for https://github.com/ververica/flink-cdc-connectors/issues/1163 (we'd like to ship incremental support in Postgres with the scan.newly-added-table.enabled feature). I'd love to be assigned; I've already started working on this.
Hello @molsionmo, are you still working on this?
Could you reassign this to me? I created a PR for this: https://github.com/ververica/flink-cdc-connectors/pull/1838
@leonardBang I'm sorry for not replying in time. I had developed part of this earlier but didn't have time to submit a PR. I have now submitted that PR separately and compared it with sap1ens's work; the two have many similar parts. Thank you @sap1ens for your excellent work.
My PR only covers support for dynamic table addition in flink-cdc-base. If sap1ens's PR is adopted, I will close my PR and help with the review and testing.
Can anyone look at my PR again?
Hey @sap1ens, Jiabao is helping to review the PR, but we've been busy with the 2.4 code freeze recently, so the review may have to continue later. The PR is a huge enhancement, and since we're close to the code freeze date I'd like to move it to the next version. WDYT?
Sure, just wanted to remind before the 2.4 release, but it looks like it's too late :) No worries.
Hope this can be merged into version 2.4.
Hi, just wanted to remind you about the PR again, thanks!
Could this be considered for 3.1? I can look into rebasing the PR if needed, assuming it'll get the attention.
@sap1ens I added this to the 3.1 roadmap. @molsionmo Do you have time to finish this in the 3.1 version? We can find someone else to finish this task if you are busy with work at your company.
I'll take a look at the PR tomorrow and let you know. Thanks!
@leonardBang I've updated the PR: https://github.com/ververica/flink-cdc-connectors/pull/1838. However, a lot of things have changed since December 2022 🙂. I found several PRs with changes for this feature, including a very large one.
What's your guidance here?
Should we copy the latest implementation of the Scan Newly Added Tables feature? It'll probably take me several days to accommodate new changes + there is more testing needed. But it may make sense to do it if you think that the existing implementation in MySQL is significantly better (the non-blocking reads are great).
On the other hand, if the current PR is good enough I can quickly add support for Postgres after that and it's already well-tested in prod (we've been running it in prod for about a year).
Hey @sap1ens, thanks for the update. I think we should copy the latest implementation, which is better than before. This feature can wait for the 3.1 release; we have enough time to finish it within the 3.1 development cycle. WDYT?
@sap1ens, it seems that not blocking the process for newly added tables is the better approach. I am also interested in PG CDC and have enough time at the moment, so I would like to collaborate with you; for instance, I can help implement certain functionality or review and give feedback on your pull requests. By the way, my PR "Add SNAPSHOT_ONLY mode for Incremental CDC Source" may have an influence on it (both will stop the stream split, for different purposes), so I will complete it this week to avoid blocking this PR.
@leonardBang I've attempted to apply new updates from the PRs I identified, but, unfortunately, it's just too much work at the moment for me, I only have a few working days left in the year. I'm also not sure that this list of PRs is complete. Likely it's not and copying changes requires comparing all relevant files one-by-one.
But I do believe it's an important change and waiting longer will increase the difference between the connectors even more. So I'd appreciate any help here, FYI @loserwang1024.
Once the cdc-base is updated, I'm happy to contribute Postgres-specific changes and tests.
@leonardBang, I'd like to do it. @sap1ens, thanks a lot; being able to reference your past work will help me avoid a lot of trouble.
Closing this issue because it was created before version 2.3.0 (2022-11-10). Please try the latest version of Flink CDC to see if the issue has been resolved. If the issue is still valid, kindly report it on Apache Jira under project Flink with component tag Flink CDC. Thank you!
Actually, this was implemented in https://github.com/ververica/flink-cdc-connectors/pull/3024