confluent-kafka-python

JSONSerializer unnecessary schema validation on every call

Open liukas321 opened this issue 2 years ago • 0 comments

Description

Schema validation is coupled with object validation in JSONSerializer.__call__(). Every time an object is serialized, the schema (JSONSerializer._parsed_schema) is validated alongside the object dict: https://github.com/confluentinc/confluent-kafka-python/blob/baf71ea0ed54c71948208bfc5c352f4ee57054dd/src/confluent_kafka/schema_registry/json_schema.py#L267

This is due to the use of jsonschema.validators.validate as the validation method, which validates the schema before validating the object on every call:

if cls is None:
    cls = validator_for(schema)    # Determines the best validator

cls.check_schema(schema)    # Uses MetaSchema of the validator to validate the schema
validator = cls(schema, *args, **kwargs)    # Initializes new validator
error = exceptions.best_match(validator.iter_errors(instance))    # Validates the object

As a result, every serialization call pays the cost of re-validating the schema, which makes JSON serialization too slow to be usable in its current state.

JSONDeserializer suffers from exactly the same problem.
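One possible way to decouple the two steps, shown here only as a minimal sketch and not as the library's actual fix, is to check the schema once and keep the resulting validator instance around for every later call. The schema and the helper names (SCHEMA, build_validator, validate_object) are hypothetical; validator_for, check_schema and the validator's validate method are the existing jsonschema calls.

```python
from jsonschema import ValidationError
from jsonschema.validators import validator_for

# Hypothetical example schema, not taken from confluent-kafka-python.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}


def build_validator(schema):
    """Check the schema once and return a reusable validator instance."""
    cls = validator_for(schema)  # pick the validator class for this schema
    cls.check_schema(schema)     # the metaschema check happens only here
    return cls(schema)


# Built once, e.g. when the serializer is constructed.
validator = build_validator(SCHEMA)


def validate_object(obj):
    """Validate only the object; the schema itself is not re-checked."""
    validator.validate(obj)  # raises ValidationError if obj does not conform
```

With this split, the per-object path only runs the instance validation, so an invalid schema is still rejected, but at construction time instead of on every call.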

How to reproduce

  1. Create an instance of JSONSerializer with any valid parameters
  2. Call it to serialize 10,000 random objects
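from confluent_kafka.serialization import MessageField, SerializationContext

# json_serializer is the JSONSerializer instance from step 1; TOPIC and
# DummyObject (with a random_obj() factory) are defined by the test setup.
def test():
    ctx = SerializationContext(topic=TOPIC, field=MessageField.VALUE)
    for _ in range(10_000):
        obj = DummyObject.random_obj()
        json_serializer(obj, ctx)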
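To isolate the cost of the repeated schema check from the rest of the serializer, the two validation styles can be timed directly against jsonschema. This is a rough sketch under stated assumptions: the schema, the sample object and the function names are made up for illustration, and the numbers depend on the schema size and the installed jsonschema version.

```python
import timeit

from jsonschema import validate
from jsonschema.validators import validator_for

# Hypothetical schema and object, used only for the timing comparison.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name"],
}
OBJ = {"name": "example", "age": 1}


def per_call():
    # Re-checks the schema against its metaschema on every invocation.
    validate(OBJ, SCHEMA)


# Check the schema once, then reuse the validator instance.
_cls = validator_for(SCHEMA)
_cls.check_schema(SCHEMA)
_cached = _cls(SCHEMA)


def reused():
    # Validates only the object.
    _cached.validate(OBJ)


def compare(n=1_000):
    """Return (seconds with per-call schema check, seconds with reused validator)."""
    return timeit.timeit(per_call, number=n), timeit.timeit(reused, number=n)


if __name__ == "__main__":
    slow, fast = compare()
    print(f"validate() per call: {slow:.3f}s, reused validator: {fast:.3f}s")
```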

Checklist

Please provide the following information:

  • [X] confluent-kafka-python and librdkafka version (confluent_kafka.version() and confluent_kafka.libversion()): version: ('2.1.0', 33619968); libversion: ('2.1.0', 33620223)
  • [ ] Apache Kafka broker version: NA
  • [ ] Client configuration: {...} NA
  • [X] Operating system: Win10-x64
  • [ ] Provide client logs (with 'debug': '..' as necessary) NA
  • [ ] Provide broker log excerpts NA
  • [ ] Critical issue Not critical

liukas321, May 04 '23 22:05