
Support topic with multiple schemas

Open jaroslawZawila opened this issue 5 years ago • 6 comments

I am trying to use this connector with a topic that contains multiple schemas. The problem is that the schema subject name is topic-record, whereas this code hard-codes it to topic-value. Are you going to support this kind of scenario?
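To illustrate the mismatch, here is a minimal sketch of how the two subject names are formed. It follows the Schema Registry naming conventions (`<topic>-value` for TopicNameStrategy, `<topic>-<fully-qualified record name>` for TopicRecordNameStrategy). The class and method names, the topic `orders`, and the record `com.example.OrderCreated` are made up for illustration and are not taken from the connector's code.

```java
// Hypothetical helper, not part of kafka-connect-bigquery.
final class SubjectNames {
    private SubjectNames() {}

    /** TopicNameStrategy: one subject per topic side, e.g. "orders-value". */
    static String topicNameSubject(String topic, boolean isKey) {
        return topic + (isKey ? "-key" : "-value");
    }

    /** TopicRecordNameStrategy: one subject per record type within a topic. */
    static String topicRecordNameSubject(String topic, String recordFullName) {
        return topic + "-" + recordFullName;
    }

    public static void main(String[] args) {
        // The subject the connector is described as looking up today.
        System.out.println(topicNameSubject("orders", false));
        // The subject under which a multi-schema topic registers a record type.
        System.out.println(topicRecordNameSubject("orders", "com.example.OrderCreated"));
    }
}
```

A lookup against the first subject cannot find a schema that was registered under the second, which is the failure described above.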

jaroslawZawila avatar Jul 16 '19 10:07 jaroslawZawila

@mtagle If I made the changes to add support for the above, would you accept them?

jaroslawZawila avatar Oct 09 '19 09:10 jaroslawZawila

I'm also ready to put some time & effort into preparing a pull request. It would be great to know that you @mtagle and @criccomini are not going to reject the whole idea.

mkubala avatar Oct 28 '19 10:10 mkubala

I have to admit I'm a little unclear about what you mean by "a topic with multiple schemas". I'm particularly unclear on how a topic with multiple schemas would successfully end up in a table in BQ with (presumably) a single fixed schema. Could you elaborate on this? Some details about your use case would be helpful as well.

mtagle avatar Oct 28 '19 22:10 mtagle

In our use case we are using an event sourcing approach where a single topic contains many different entities. We would like to store those entities in dedicated BigQuery tables (one table for each event type) for further data ingestion / BI.

This approach is already supported by Kafka and Schema Registry (see https://www.confluent.io/blog/put-several-event-types-kafka-topic/).

We would add two new configuration options (the names can be changed, this is only a draft):

  • recordNamesToTables - A list of comma-delimited strings representing mapping between record names and table names (just like topicsToTables but for records). Default: empty list.
  • multipleSchemaTopics - A boolean feature switch. Default: false.

Table names would consist of the mapped topic name and the mapped record name, separated by an underscore (_).
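A rough sketch of how the table name could be resolved under this draft. Everything here is an assumption rather than existing connector code: the `record=table` entry syntax for recordNamesToTables, the fallback to the raw record name when no mapping exists, the replacement of characters other than letters, digits and underscores, and the class and method names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the draft options; not connector code.
final class RecordTableNames {
    private RecordTableNames() {}

    /** Parses "recordA=tableA,recordB=tableB" (assumed syntax) into a map. */
    static Map<String, String> parseMappings(String raw) {
        Map<String, String> out = new LinkedHashMap<>();
        if (raw == null || raw.trim().isEmpty()) {
            return out; // draft default: empty list
        }
        for (String entry : raw.split(",")) {
            String[] kv = entry.trim().split("=", 2);
            if (kv.length != 2 || kv[0].trim().isEmpty() || kv[1].trim().isEmpty()) {
                throw new IllegalArgumentException("Bad mapping entry: '" + entry + "'");
            }
            out.put(kv[0].trim(), kv[1].trim());
        }
        return out;
    }

    /** Returns "<mapped topic>_<mapped record>" when the feature switch is on. */
    static String tableFor(String mappedTopic, String recordName,
                           Map<String, String> recordNamesToTables,
                           boolean multipleSchemaTopics) {
        if (!multipleSchemaTopics) {
            return mappedTopic; // switch off: one table per topic, as today
        }
        // Assumed fallback: use the record name itself when it has no mapping.
        String recordPart = recordNamesToTables.getOrDefault(recordName, recordName);
        return sanitize(mappedTopic + "_" + recordPart);
    }

    /** Assumed rule: keep letters, digits and underscores, replace the rest. */
    static String sanitize(String name) {
        return name.replaceAll("[^A-Za-z0-9_]", "_");
    }
}
```

With the mapping `com.example.OrderCreated=order_created` and a topic mapped to `orders`, events of that type would go to `orders_order_created`, while an unmapped `com.example.OrderShipped` would fall back to `orders_com_example_OrderShipped`.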

What do you think about the whole idea?

mkubala avatar Oct 29 '19 10:10 mkubala

Thank you for the clarification, this makes a lot more sense to me now!

As the schema-registry already supports this, I'm in support of adding this feature to KCBQ. We'd be happy to review and merge your changes if you wanted to implement this.

mtagle avatar Oct 29 '19 19:10 mtagle

Great! Expect a PR soon :)

mkubala avatar Oct 29 '19 21:10 mkubala