
OverflowError: Python int too large to convert to C int when using confluent_kafka Avro deserializer

Open AndreaBencini90 opened this issue 4 months ago • 3 comments

Description

I'm encountering an OverflowError when attempting to deserialize messages using the confluent_kafka Avro deserializer in Python. Here's a simplified version of my code:

from confluent_kafka import Consumer

class ConsumerKafka:
    def __init__(self, deserializer, conf):
        self.deserializer = deserializer   # an AvroDeserializer instance
        self.consumer = Consumer(conf)     # conf: consumer configuration dict

    def decode(self, msg_value):
        # msg_value: raw message value bytes read from the topic
        deserialized_data = self.deserializer(msg_value, None)
        return deserialized_data

self.deserializer is an AvroDeserializer object from confluent_kafka.schema_registry.avro.
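For context, a minimal sketch of how that deserializer is typically wired up (the Schema Registry URL and consumer settings here are placeholders, not the configuration from this report):

from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroDeserializer

# Placeholder registry URL; the deserializer fetches the writer schema from it.
schema_registry_client = SchemaRegistryClient({"url": "http://localhost:8081"})
deserializer = AvroDeserializer(schema_registry_client)

# Placeholder consumer configuration.
consumer = ConsumerKafka(deserializer, {"bootstrap.servers": "localhost:9092",
                                        "group.id": "example-group"})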

When I call self.deserializer(msg_value, None), I get the following traceback:

Traceback (most recent call last):
  File "c:\Users\048115571\Documents\python\TTA\read_from_topic\modules\consumer.py", line 30, in decode
    deserialized_data = self.deserializer(msg_value, None)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\048115571\AppData\Local\Programs\Python\Python311\Lib\site-packages\confluent_kafka\schema_registry\avro.py", line 429, in __call__
    obj_dict = schemaless_reader(payload,
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "fastavro\\_read.pyx", line 1142, in fastavro._read.schemaless_reader
  File "fastavro\\_read.pyx", line 1169, in fastavro._read.schemaless_reader
  File "fastavro\\_read.pyx", line 748, in fastavro._read._read_data
  File "fastavro\\_read.pyx", line 621, in fastavro._read.read_record
  File "fastavro\\_read.pyx", line 740, in fastavro._read._read_data
  File "fastavro\\_read.pyx", line 558, in fastavro._read.read_union
  File "fastavro\\_read.pyx", line 724, in fastavro._read._read_data
  File "fastavro\\_read.pyx", line 393, in fastavro._read.read_array
  File "fastavro\\_read.pyx", line 748, in fastavro._read._read_data
  File "fastavro\\_read.pyx", line 621, in fastavro._read.read_record
  File "fastavro\\_read.pyx", line 740, in fastavro._read._read_data
  File "fastavro\\_read.pyx", line 558, in fastavro._read.read_union
  File "fastavro\\_read.pyx", line 770, in fastavro._read._read_data
  File "fastavro\\_logical_readers.pyx", line 22, in fastavro._logical_readers.read_timestamp_millis
  File "fastavro\\_logical_readers.pyx", line 24, in fastavro._logical_readers.read_timestamp_millis
OverflowError: Python int too large to convert to C int

How to reproduce

  • confluent_avro 1.8.0
  • confluent-kafka 2.3.0
  • fastavro 1.9.2
  • kafka-python 2.0.2
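A minimal standalone sketch that reproduces the same failure with fastavro alone, assuming a record with a single timestamp-millis field and an out-of-range raw long (the real schema and message are not included in this report):

import io
from fastavro import schemaless_reader, schemaless_writer

# Hypothetical one-field schema; the real schema is not shown in this issue.
schema = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "ts", "type": {"type": "long", "logicalType": "timestamp-millis"}},
    ],
}

buf = io.BytesIO()
# fastavro passes a raw int through unchanged when writing timestamp-millis,
# so an out-of-range value can end up encoded on the wire.
schemaless_writer(buf, schema, {"ts": -9223370327508000000})
buf.seek(0)
schemaless_reader(buf, schema)  # raises OverflowError in read_timestamp_millis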

Checklist
=========

Please provide the following information:

  • [ ] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()):

  • [ ] Apache Kafka broker version:

  • [ ] Client configuration: {...}

  • [x] Operating system: Windows

  • [ ] Provide client logs (with 'debug': '..' as necessary)

  • [ ] Provide broker log excerpts

  • [ ] Critical issue

AndreaBencini90 commented Feb 29 '24 11:02

Can you confirm that you are using long and not int for the type? I can see it is failing while deserializing a timestamp in milliseconds.

Can you please provide the schema information and the message?
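As background, fastavro turns a timestamp-millis long into a datetime, which can only represent years 1 through 9999, so even a schema-valid long can overflow at decode time. A quick range check:

from datetime import datetime, timedelta, timezone

# Range of timestamp-millis values that map onto a representable datetime.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
lo = (datetime.min.replace(tzinfo=timezone.utc) - epoch) // timedelta(milliseconds=1)
hi = (datetime.max.replace(tzinfo=timezone.utc) - epoch) // timedelta(milliseconds=1)
print(lo, hi)  # -62135596800000 .. 253402300799999
# Anything outside this window fails with an OverflowError while decoding.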

pranavrth commented Mar 06 '24 14:03

The problem happened when the consumer tried to deserialize the value -9223370327508000000 from the topic. Obviously this is an error in the data and not in the library. But since the consumer encounters an error, it cannot consume the rest of the message. In my opinion, it would be nice to have an option to handle these situations: the user should be able to choose whether to let it break and throw an exception, or to leave unconverted any data field where a similar issue is observed.
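Until an option like that exists, a hypothetical wrapper (not part of confluent-kafka-python) can approximate the behaviour by catching the error and returning a fallback instead of raising:

class TolerantDeserializer:
    """Hypothetical wrapper: returns a fallback instead of raising when the
    inner AvroDeserializer hits an undecodable value."""

    def __init__(self, inner, fallback=None):
        self.inner = inner
        self.fallback = fallback

    def __call__(self, value, ctx):
        try:
            return self.inner(value, ctx)
        except OverflowError:
            # The whole message fails to decode, so without patching fastavro
            # the best we can do here is return a sentinel value.
            return self.fallback

Note that this drops the whole record rather than just the offending field; a per-field fallback would need support inside fastavro's logical readers.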

AndreaBencini90 commented Mar 07 '24 20:03