confluent-kafka-python

AvroConsumer — access to key & value schemas

Open rcoup opened this issue 7 years ago • 5 comments

Description

AvroConsumer happily returns decoded Avro messages, but drops all reference to the original Avro schema and its identifier. The schema is super-useful in some cases (e.g. Debezium, where it contains the source database table schema).

One (relatively straightforward) solution is to:

  1. update avro.MessageSerializer.decode_message() to return a (schema_id, payload) tuple
  2. have AvroConsumer.poll() wrap the Message in a Python AvroMessage subclass which has additional .key_schema_id and .value_schema_id attributes. Could also do this in C.
  3. expose the schema registry client via an AvroConsumer.get_schema(schema_id) method
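
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration of the idea, not code from the library: `AvroMessage`, `wrap_message` and the `decode` callable are hypothetical names, `decode` is assumed to return a (schema_id, payload) tuple as step 1 suggests, and the sketch uses a delegating wrapper rather than a true subclass of Message.

```python
class AvroMessage:
    """Hypothetical wrapper (step 2): a decoded message plus its schema ids."""

    def __init__(self, msg, key, value, key_schema_id, value_schema_id):
        self._msg = msg
        self._key = key
        self._value = value
        self.key_schema_id = key_schema_id
        self.value_schema_id = value_schema_id

    def key(self):
        # Decoded key, not the raw bytes.
        return self._key

    def value(self):
        # Decoded value, not the raw bytes.
        return self._value

    def __getattr__(self, name):
        # Only reached for attributes not defined above, e.g. topic() or
        # offset(); these are forwarded to the wrapped message unchanged.
        return getattr(self._msg, name)


def wrap_message(msg, decode):
    """Build an AvroMessage from a raw message.

    `decode` is an assumed callable: bytes -> (schema_id, payload),
    i.e. the tuple-returning decode from step 1.
    """
    def decode_or_none(raw):
        return decode(raw) if raw is not None else (None, None)

    key_schema_id, key = decode_or_none(msg.key())
    value_schema_id, value = decode_or_none(msg.value())
    return AvroMessage(msg, key, value, key_schema_id, value_schema_id)
```

Existing callers that only use key(), value() and the other Message accessors would keep working with such a wrapper, which is why step 2 looks backwards compatible on its own.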

Checklist

Please provide the following information:

  • [x] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): 0.11.0
  • [ ] Apache Kafka broker version:
  • [ ] Client configuration: {...}
  • [ ] Operating system:
  • [ ] Provide client logs (with 'debug': '..' as necessary)
  • [ ] Provide broker log excerpts
  • [ ] Critical issue

rcoup • Dec 19 '17 19:12

That makes sense.

Since 1 and 2 would be breaking changes for existing applications, I guess we are stuck with 3 for the time being.

Would you care to provide a PR?

edenhill • Jan 03 '18 07:01

That was all one proposed solution, not three options. (1) is the only API change; I assumed it was internal, but we could always add a different method to MessageSerializer instead. Isn't (2) backwards compatible?

Which bits are considered internal vs published APIs?

rcoup • Jan 03 '18 19:01

Sorry, misinterpreted your suggestion.

AvroConsumer is the only public interface to worry about here; changing poll() to return an AvroMessage may be a breaking change for existing applications.

What if we start out with your suggested implementation, but prior to the next release decide to break poll() or add a second method for polling AvroMessages?

Would you care to submit a PR for this?

edenhill • Jan 16 '18 09:01

I'm looking for a way to get the schema (or the schema id) when consuming with AvroConsumer. Is there a way to do this without changing the source? Thanks.

JimMcHale • Apr 27 '18 00:04
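
On getting the schema id without changing the source: one possible workaround, not confirmed anywhere in this thread, is to read the raw bytes yourself. Messages written with the Confluent Avro serializers start with a magic byte of 0 followed by a 4-byte big-endian schema id, so the id can be recovered from the undecoded payload. A minimal sketch, where `schema_id_from_raw` is a hypothetical helper name:

```python
import struct


def schema_id_from_raw(raw):
    """Return the schema id from a Confluent-framed payload, or None.

    Assumes the Confluent wire format: 1 magic byte (0), then a 4-byte
    big-endian schema id, then the Avro-encoded body.
    """
    if raw is None or len(raw) < 5:
        return None
    magic, schema_id = struct.unpack(">bI", raw[:5])
    if magic != 0:
        # Not framed by a schema-registry-aware serializer.
        return None
    return schema_id
```

The helper would be fed `msg.key()` or `msg.value()` from a plain confluent_kafka.Consumer, which hands back the undecoded bytes. The trade-off is that the Avro body then has to be decoded separately.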

Is there any update on this?

adisunw • Feb 18 '20 22:02