
[FEATURE] Automatically encode Kafka entries using Confluent Avro Serialiser format

Open · eolivelli opened this issue 3 years ago · 1 comment

Is your feature request related to a problem? Please describe. When I consume data from a Pulsar topic with AVRO or KeyValue<AVRO,AVRO> format using KOP, I cannot leverage the schema information. It would be great if KOP could pack the Kafka payload using the conventions of the Confluent Kafka serializers (https://docs.confluent.io/platform/current/schema-registry/serdes-develop/serdes-avro.html).

Describe the solution you'd like Add a new entry format, "pulsar-schema-aware", similar to "pulsar", but in the consumer flow we look for a schema version and, when an AVRO schema is present, we:

  • register the schema on a Schema Registry (we need a URL and auth credentials)
  • inject the schema id into the payload

We have to handle both AVRO and KeyValue schemas.

This way Kafka users can use Kafka Connect and other tools to read from Pulsar topics.
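The "inject the schema id" step above refers to the Confluent wire format: a magic byte `0x0`, the 4-byte schema id in big-endian order, then the raw Avro-encoded payload. A minimal sketch of what the Fetch path would prepend (the `ConfluentWireFormat` class name is hypothetical, not part of KOP):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

public class ConfluentWireFormat {
    // The Confluent serializers reserve 0x0 as the protocol magic byte.
    private static final byte MAGIC_BYTE = 0x0;

    /**
     * Wraps a raw Avro payload in the Confluent wire format:
     * [magic byte][4-byte big-endian schema id][avro payload].
     */
    public static byte[] encode(int schemaId, byte[] avroPayload) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(MAGIC_BYTE);
        out.write(ByteBuffer.allocate(4).putInt(schemaId).array());
        out.write(avroPayload);
        return out.toByteArray();
    }
}
```

With this framing in place, any record fetched through KOP would be decodable by the stock `KafkaAvroDeserializer`, as long as the schema id points at a schema registered in the registry.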

eolivelli avatar Apr 20 '22 10:04 eolivelli

Now that the SchemaRegistry has landed on the master branch, I will put up a proposal for the Confluent Kafka serializers for AVRO on the consumer (Fetch) path.

My idea is to automatically register the schema in the Kafka SchemaRegistry while fetching with the Kafka client, and to encode the payload so that the Confluent Kafka AVRO deserializer can decode the data.
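If that works, a plain Kafka consumer should be able to read an AVRO Pulsar topic through KOP using the stock Confluent deserializer. A hypothetical usage sketch, where the bootstrap address, registry URL, and topic name are all placeholders:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PulsarTopicAvroConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // KOP listener on the Pulsar broker (placeholder host/port)
        props.put("bootstrap.servers", "pulsar-broker:9092");
        props.put("group.id", "avro-demo");
        // The Confluent deserializer strips the magic byte + schema id header
        // and looks the schema up in the registry.
        props.put("key.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        // Registry where KOP would have registered the Pulsar schema (placeholder URL)
        props.put("schema.registry.url", "http://schema-registry:8081");

        try (KafkaConsumer<GenericRecord, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("my-avro-topic"));
            for (ConsumerRecord<GenericRecord, GenericRecord> record : consumer.poll(Duration.ofSeconds(5))) {
                System.out.println(record.value());
            }
        }
    }
}
```

This is just a configuration sketch: nothing KOP-specific is needed on the client side, which is the whole point of following the Confluent conventions.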

eolivelli avatar Jun 06 '22 14:06 eolivelli