cdc-apache-cassandra
[Source][Utilization] Enable processing multiple C* tables in a single source instance
Today, the C* source connector only allows a 1:1 mapping between tables and sinks. To increase the utilization of the underlying resources associated with a single source instance (e.g., the memory footprint of a single sink is ~500MB, which does not scale well if the user has tens or hundreds of tables), the proposal is to let users configure multiple tables in their source config.
Proposed source config:
configs:
  "contactPoints": "localhost"
  "loadBalancing.localDc": "Cassandra"
  "outputFormat": "key-value-avro"
  tables:
    ks1:
      table1:
        "events.topic": "persistent://public/default/events-ks1.table1"
        "data.topic": "persistent://public/default/data-ks1.table1"
    ks2:
      table2:
        "events.topic": "persistent://public/default/events-ks2.table2"
        "data.topic": "persistent://public/default/data-ks2.table2"
Alternatively, we can keep the config as close as possible to today's single-table configs by replacing data.topic with destination-topic-name.
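To make the proposed shape concrete, here is a minimal Python sketch of how the nested tables map (keyspace, then table, then topics) could be flattened into one route per table. It is illustrative only and not the connector's implementation: `TableRoute` and `flatten_tables` are hypothetical names, and the dict is assumed to hold the contents of `configs` after parsing.

```python
from typing import NamedTuple


class TableRoute(NamedTuple):
    """Hypothetical record: one C* table and the topics it maps to."""
    keyspace: str
    table: str
    events_topic: str
    data_topic: str


REQUIRED_KEYS = {"events.topic", "data.topic"}


def flatten_tables(config: dict) -> list[TableRoute]:
    """Turn the nested keyspace -> table -> topics map into a flat list."""
    routes = []
    for keyspace, tables in config.get("tables", {}).items():
        for table, topics in tables.items():
            # Fail early if a table entry lacks one of the two topics.
            missing = REQUIRED_KEYS - set(topics)
            if missing:
                raise ValueError(
                    f"{keyspace}.{table} is missing {sorted(missing)}"
                )
            routes.append(
                TableRoute(
                    keyspace,
                    table,
                    topics["events.topic"],
                    topics["data.topic"],
                )
            )
    return routes


# Assumed parsed form of the proposed source config.
proposed_config = {
    "contactPoints": "localhost",
    "loadBalancing.localDc": "Cassandra",
    "outputFormat": "key-value-avro",
    "tables": {
        "ks1": {
            "table1": {
                "events.topic": "persistent://public/default/events-ks1.table1",
                "data.topic": "persistent://public/default/data-ks1.table1",
            }
        },
        "ks2": {
            "table2": {
                "events.topic": "persistent://public/default/events-ks2.table2",
                "data.topic": "persistent://public/default/data-ks2.table2",
            }
        },
    },
}

routes = flatten_tables(proposed_config)
for route in routes:
    print(f"{route.keyspace}.{route.table}: {route.events_topic} -> {route.data_topic}")
```

A flat list like this would let a single source instance iterate over all configured tables while sharing one set of connection settings (`contactPoints`, `loadBalancing.localDc`), which is the utilization gain the proposal is after.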