cdc-apache-cassandra

[Source][Utilization] Enable processing multiple C* tables in a single source instance

Open aymkhalil opened this issue 2 years ago • 1 comments

Today, the C* source connector only allows a 1:1 mapping between tables and sinks. To increase the utilization of the underlying resources associated with a single source instance (e.g. the memory footprint of a single sink is ~500MB, which does not scale well if the user has 10s or 100s of tables), the proposal is to let users configure multiple tables in their source config.

Proposed source config:

configs:
  contactPoints: "localhost"
  loadBalancing.localDc: "Cassandra"
  outputFormat: "key-value-avro"
  tables:
    ks1:
      table1:
        events.topic: "persistent://public/default/events-ks1.table1"
        data.topic: "persistent://public/default/data-ks1.table1"
    ks2:
      table2:
        events.topic: "persistent://public/default/events-ks2.table2"
        data.topic: "persistent://public/default/data-ks2.table2"
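
To make the intent of the nested `tables` section concrete, here is a minimal Python sketch of how a source instance might expand it into one route per table. This is an illustration only, not the connector's implementation: `TableRoute` and `flatten_tables` are hypothetical names, and the config is written as an already-parsed dict rather than loaded from the real source config.

```python
from typing import Dict, List, NamedTuple


class TableRoute(NamedTuple):
    """One keyspace.table pair and the two topics it is wired to (hypothetical type)."""
    keyspace: str
    table: str
    events_topic: str
    data_topic: str


def flatten_tables(config: Dict) -> List[TableRoute]:
    """Expand the nested keyspace -> table -> topics mapping into flat routes."""
    routes = []
    for keyspace, tables in config.get("tables", {}).items():
        for table, topics in tables.items():
            # Both topics are required for every table in the proposed layout.
            missing = {"events.topic", "data.topic"} - topics.keys()
            if missing:
                raise ValueError(f"{keyspace}.{table} is missing {sorted(missing)}")
            routes.append(
                TableRoute(keyspace, table, topics["events.topic"], topics["data.topic"])
            )
    return routes


# The proposed config from this issue, as an already-parsed dict.
proposed_config = {
    "contactPoints": "localhost",
    "loadBalancing.localDc": "Cassandra",
    "outputFormat": "key-value-avro",
    "tables": {
        "ks1": {
            "table1": {
                "events.topic": "persistent://public/default/events-ks1.table1",
                "data.topic": "persistent://public/default/data-ks1.table1",
            }
        },
        "ks2": {
            "table2": {
                "events.topic": "persistent://public/default/events-ks2.table2",
                "data.topic": "persistent://public/default/data-ks2.table2",
            }
        },
    },
}

routes = flatten_tables(proposed_config)
for route in routes:
    print(f"{route.keyspace}.{route.table}: {route.events_topic} -> {route.data_topic}")
```

With a flat list like this, a single source instance could iterate over all configured tables instead of requiring one instance per table, which is the utilization gain the proposal is after.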

aymkhalil • Oct 24 '22 23:10

Alternatively, we can keep the config as close as possible to today's single-table config by replacing data.topic with destination-topic-name.

aymkhalil • Oct 24 '22 23:10