[Feature][CDCSOURCE] Supports filter data

Open stdnt-xiao opened this issue 1 year ago • 2 comments

Search before asking

  • [x] I had searched in the issues and found no similar feature requirement.

Description

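I would like CDCSOURCE to support data filtering so that multi-center databases can be synchronized bidirectionally. Filtering would prevent data loops, where a change replicated from one database is captured again and sent back to its origin. A similar approach is described here: https://www.confluent.io/blog/sync-databases-and-remove-silos-with-kafka-cdc/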

My current idea is to add parameters to 'flink-kafka-connector' and filter records in its 'write' function according to a 'sink.filter.pattern' rule.

Are there any better suggestions? Thanks.

EXECUTE CDCSOURCE jobname WITH (
  'connector' = 'mysql-cdc',
  'hostname' = '127.0.0.1',
  'port' = '3306',
  'username' = 'dlink',
  'password' = 'dlink',
  'checkpoint' = '3000',
  'scan.startup.mode' = 'initial',
  'parallelism' = '1',
  'table-name' = 'test.student,test.score',
  'sink.connector' = 'datastream-kafka',
  'sink.brokers' = '127.0.0.1:9092',
  'sink.filter.pattern' = '$[?(@.SRC == "SQLSRV")]',
  'sink.filter.model' = 'exclude'
)
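To make the intended semantics of the proposed 'sink.filter.pattern' and 'sink.filter.model' options concrete, here is a minimal Python sketch of the per-record decision the sink's 'write' function might make. It is only an illustration under several assumptions: the helper names `parse_pattern` and `should_write` are hypothetical, the parser handles only the single form `$[?(@.FIELD == "VALUE")]` rather than full JSONPath, and the 'include' value is assumed as the counterpart of 'exclude' (the issue itself shows only 'exclude'). A real implementation would live in the Java connector and would probably use a JSONPath library.

```python
import json
import re

# Assumption: only the simple form $[?(@.FIELD == "VALUE")] is supported here.
_PATTERN = re.compile(r'^\$\[\?\(@\.(\w+)\s*==\s*"([^"]*)"\)\]$')


def parse_pattern(pattern):
    """Extract (field, expected_value) from a simple equality filter."""
    match = _PATTERN.match(pattern)
    if match is None:
        raise ValueError(f"unsupported filter pattern: {pattern}")
    return match.group(1), match.group(2)


def should_write(record_json, pattern, model="exclude"):
    """Return True if the record should be written to the sink.

    'exclude' drops records that match the pattern;
    'include' (an assumed counterpart) keeps only matching records.
    """
    field, expected = parse_pattern(pattern)
    record = json.loads(record_json)
    matched = isinstance(record, dict) and record.get(field) == expected
    if model == "exclude":
        return not matched
    if model == "include":
        return matched
    raise ValueError(f"unknown filter model: {model}")


if __name__ == "__main__":
    pattern = '$[?(@.SRC == "SQLSRV")]'
    # A change that originated from the other side is dropped, which breaks the loop.
    print(should_write('{"id": 1, "SRC": "SQLSRV"}', pattern))  # False
    # A locally originated change is forwarded.
    print(should_write('{"id": 2, "SRC": "MYSQL"}', pattern))   # True
```

The loop is broken because every replicated row carries a marker of its origin (here the `SRC` field, taken from the example pattern), and each side refuses to forward rows that came from the other side.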

Use case

No response

Related issues

No response

Are you willing to submit a PR?

  • [X] Yes I am willing to submit a PR!

Code of Conduct

stdnt-xiao, Oct 14 '23 07:10

Worth a try.

aiwenmo, Oct 15 '23 14:10

Hello, this issue has not been active for more than 30 days. This issue will be closed in 7 days if there is no response. If you have any questions, you can comment and reply.

github-actions[bot], Feb 01 '24 00:02