infinispan-kafka

Derive cache structure from incoming record schema

Open gunnarmorling opened this issue 4 years ago • 4 comments

Hi, I don't fully understand the workings around the ProtoBuf annotated classes yet (like Author in the README). Is this strictly needed? Would it be possible to derive the cache structure from the schema of the incoming SinkRecords?

gunnarmorling avatar Apr 17 '20 23:04 gunnarmorling

Using proto is not mandatory; there is an option to enable or disable it. Can you give me an example of "deriving the cache structure"?

oscerd avatar Apr 18 '20 10:04 oscerd

Essentially, I'm trying to figure out how this connector should be used. In particular, is it necessary to have annotated classes such as Author, or not? If not, how will things work, i.e. what does the data structure that's put into the cache look like?

> Can you give me an example of "deriving the cache structure"?

What I meant is: each SinkRecord has a key schema and a value schema, which describe the structure of the key and value, respectively. E.g. the JDBC sink connector uses this schema information to create a corresponding CREATE TABLE statement to initialize the sink database. I'm wondering whether the same would work for Infinispan. But then I don't know how the data is structured in caches to begin with (is it just a binary BLOB perhaps?).
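To make the idea concrete, here is a minimal sketch of what "deriving the structure" could look like, by analogy with the CREATE TABLE case: a flat record schema is turned into a Protobuf message definition. Everything in it is an assumption for illustration only. `SchemaToProtoSketch`, `protoType` and `toProtoMessage` are hypothetical names, a plain `Map` of field name to type name stands in for Kafka Connect's `Schema`, and the field names are made up. Only the type names (`INT32`, `STRING`, ...) mirror Kafka Connect's `Schema.Type` constants. It is not how the connector works today.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of infinispan-kafka or Kafka Connect.
class SchemaToProtoSketch {

    // Maps a Kafka Connect primitive type name (as in Schema.Type)
    // to a Protobuf scalar type. Nested types are out of scope here.
    static String protoType(String connectType) {
        switch (connectType) {
            case "INT8":
            case "INT16":
            case "INT32":   return "int32";
            case "INT64":   return "int64";
            case "FLOAT32": return "float";
            case "FLOAT64": return "double";
            case "BOOLEAN": return "bool";
            case "STRING":  return "string";
            case "BYTES":   return "bytes";
            default:
                throw new IllegalArgumentException(
                        "Unsupported type in this sketch: " + connectType);
        }
    }

    // Renders a proto3-style message; field tags follow the field order.
    static String toProtoMessage(String messageName, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("message " + messageName + " {\n");
        int tag = 1;
        for (Map.Entry<String, String> field : fields.entrySet()) {
            sb.append("  ")
              .append(protoType(field.getValue())).append(' ')
              .append(field.getKey()).append(" = ").append(tag++).append(";\n");
        }
        return sb.append("}\n").toString();
    }

    public static void main(String[] args) {
        // Stand-in for a value schema; the field names are invented.
        Map<String, String> valueSchema = new LinkedHashMap<>();
        valueSchema.put("name", "STRING");
        valueSchema.put("surname", "STRING");
        System.out.println(toProtoMessage("Author", valueSchema));
    }
}
```

A real implementation would presumably walk the record's value schema instead of a `Map`, and would need to handle nested structs, arrays, maps and optional fields, which this sketch deliberately rejects.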

gunnarmorling avatar Apr 18 '20 10:04 gunnarmorling

Actually, if you don't use the ProtoStream support, the component will just push the Object as it is, plain and raw.

https://github.com/infinispan/infinispan-kafka/blob/master/core/src/main/java/org/infinispan/kafka/InfinispanSinkTask.java#L159

I know this should be handled better, but when I wrote this connector I was really at the beginning of my experience with Kafka and Kafka Connect, so I think there is room to do things better. Can you please open an issue for enhancing this behavior? Thank you, @gunnarmorling.

I'm also planning to do the source part, but at the moment I don't have much time.

oscerd avatar Apr 18 '20 10:04 oscerd

By the way, having the annotated class is not mandatory.

oscerd avatar Apr 18 '20 10:04 oscerd