confluent-kafka-python

Fix type hinting of avro messages

Open DrPyser opened this issue 7 years ago • 3 comments

Description

I use Pycharm as my IDE, and I dislike seeing complaints about type mismatches. The value attribute of Message objects is typed as Optional[Union[str, bytes]]. However, AvroConsumer sets that value to the deserialized message, i.e. whatever Python datatype matches the Avro schema (most often, a dict). This raises red flags in any type checker when I treat that value as a dict (or whatever I expect the deserialized message to be).

Not sure what's the best way to change the type hinting when using C bindings.

Edit: Also, Pycharm thinks Message.value takes a payload argument. Not sure why that is.

How to reproduce

e.g.

from confluent_kafka.avro import AvroConsumer

consumer = AvroConsumer(...)
message = consumer.poll()
field = message.value().get("field") # Pycharm highlights this as an error

Checklist

Please provide the following information:

  • [x] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): confluent_kafka.version() = ('0.11.5', 722176), confluent_kafka.libversion() = ('0.11.5', 722431)
  • [ ] Apache Kafka broker version: N/A
  • [ ] Client configuration: N/A
  • [ ] Operating system: N/A
  • [ ] Provide client logs (with 'debug': '..' as necessary)
  • [ ] Provide broker log excerpts
  • [ ] Critical issue

DrPyser, Oct 11 '18 19:10

Also, not sure if I should create separate issues, but there are other type hinting problems.

  • The docstring for AvroProducer.__init__ says the arguments default_key_schema and default_value_schema are strings (str), but it seems they are actually supposed to be (or at least can be) avro.Schema objects, such as those obtained from confluent_kafka.avro.load.

DrPyser, Oct 11 '18 22:10

I'm not 100% sure how we would fix this for AvroConsumers, since the type won't be known until runtime. As you mentioned, Avro will deserialize the contents of the message, which are in fact a byte sequence, into the type defined in the writer's schema.
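Since the concrete type is only known at runtime, one option on the caller's side is to narrow the value before using it. The following is a minimal sketch under that assumption; as_record is a hypothetical helper, not part of confluent-kafka-python, and the literal values stand in for whatever message.value() returns after Avro deserialization.

```python
from typing import Any, Dict, Optional


def as_record(value: Any) -> Optional[Dict[str, Any]]:
    """Return the value unchanged if it is a dict (an Avro record), else None.

    Hypothetical helper for illustration; not part of confluent-kafka-python.
    """
    if isinstance(value, dict):
        return value
    return None


# Stand-ins for what message.value() might return after deserialization.
record = as_record({"field": 42})
scalar = as_record("just a string")

if record is not None:
    # Inside this branch the type checker sees Dict[str, Any], so .get() is fine.
    field = record.get("field")
```

Because the parameter is annotated as Any, the call type checks regardless of how the value is declared, and the isinstance check keeps the narrowing honest at runtime.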

As for your comment about Avro[Producer|Consumer].__init__, I agree: we should change the type to be a schema as opposed to str.

rnpridgeon, Oct 15 '18 12:10

Hmm, I was mistakenly thinking that messages would always be dicts, but yeah, they can also be scalar types, or arrays.

At the very least, you can type it as typing.Any, which would make it type check. With a bit more effort, you can also define the type as a Union of all possible avro types, e.g.

AvroValue = Union[str, int, bytes, Dict[str, Any], List[Any], ...] # not sure if i'm missing types?
class AvroMessage:
    def value(self) -> AvroValue: ...

Python typing/mypy doesn't support recursive types yet, sadly.
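A slightly fuller sketch of the Union approach, using Any for nested containers in place of a recursive alias. The members of AvroValue are an assumption about the usual Avro to Python mapping and may not be exhaustive. AvroMessage here is a hypothetical stand-in class, not the library's Message type.

```python
from typing import Any, Dict, List, Union, cast

# Assumed, possibly incomplete mapping of Avro types to Python types.
# Nested containers fall back to Any instead of a recursive alias.
AvroValue = Union[None, bool, int, float, bytes, str, List[Any], Dict[str, Any]]


class AvroMessage:
    """Hypothetical stand-in for a message whose value is already deserialized."""

    def __init__(self, value: AvroValue) -> None:
        self._value = value

    def value(self) -> AvroValue:
        return self._value


message = AvroMessage({"field": "hello"})

# The caller knows the schema describes a record, so it says so explicitly.
record = cast(Dict[str, Any], message.value())
field = record.get("field")
```

With a Union return type the caller still has to narrow (via cast or isinstance) before calling dict methods, whereas typing the return as Any would skip that step at the cost of losing all checking on the value.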

DrPyser, Oct 16 '18 02:10