
Cannot stream results to a stream or table that uses a referenced schema which contains internal schema references

Open rodesai opened this issue 2 years ago • 2 comments

ksqlDB supports specifying a schema ID in the WITH clause of a DDL/DML statement. However, if the schema referenced by the schema ID is a Protobuf schema that references other schemas, then ksqlDB fails to write any results to that stream or table.

This is because when we load the schema in the serializer, the Schema Registry converters resolve all internal references and build a fully resolved Connect schema. Then, when we use the serializer to write a record, we pass the record along with the fully resolved Connect schema to the Schema Registry (SR) serializer. The SR serializer compares this against the schema stored in Schema Registry and fails because it considers the two incompatible.

The resulting processing log message looks something like:

"CAUSE": [
        "Failed to serialize Protobuf data from topic <topic name> :",
        "Error serializing Protobuf message",
        "Incompatible schema syntax = <schema dump>"
        ...

We have a couple approaches we could take to resolve this:

  1. The SR serializer supports a config, id.compatibility.strict=false, which disables this client-side check. What I'm not sure of is whether the resulting record will actually be deserializable, as it's serialized using a fully resolved schema but references a schema with internal references. We would need to test this out.
  2. We could add support for schema references in the Connect schema.

rodesai avatar Jun 03 '22 09:06 rodesai

Seems this is fixed by the following PRs: https://github.com/confluentinc/ksql/pull/8933 https://github.com/confluentinc/ksql/pull/9047 https://github.com/confluentinc/ksql/pull/8984

When a user selects a schema ID whose schema contains multiple Protobuf message definitions, the user must specify the full schema name with VALUE_SCHEMA_FULL_NAME so that values of that Protobuf schema are deserialized and serialized correctly.
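
As a rough sketch of what that could look like, the statement below sets both properties in the WITH clause. The stream names, topic name, schema ID 42, and message name com.example.Order are all hypothetical placeholders, not values from this issue, and the selected columns are assumed to match the registered schema.

```sql
-- Hypothetical example: every name and the schema ID below are placeholders.
CREATE STREAM orders_out
  WITH (
    KAFKA_TOPIC = 'orders_out',
    VALUE_FORMAT = 'PROTOBUF',
    -- ID of a schema already registered in Schema Registry (assumed value)
    VALUE_SCHEMA_ID = 42,
    -- Picks one message when the schema defines several (assumed name)
    VALUE_SCHEMA_FULL_NAME = 'com.example.Order'
  ) AS
  SELECT * FROM orders_in;
```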

spena avatar Jun 03 '22 15:06 spena

Hello!

Could you please remove the name of the company and change the schema example a little?

It's private information that shouldn't be exposed on a public GitHub repository.

odilonjk avatar Jun 21 '22 15:06 odilonjk