
Example for schema evolution?

Open krukru opened this issue 5 years ago • 5 comments

Hi there,

Looking at the docs, you reference "Compatibility and schema evolution", but I am unable to find any example on how to achieve schema evolution, which is one of the main benefits of using Avro.

For example: I would like new consumers to be able to read old producer data. Does this library support this?

For example:

V1 schema

{
    "type": "record",
    "name": "Foo",
    "fields": [
        {
            "name": "f1",
            "type": "string"
        }
    ]
}

V2 schema

{
    "type": "record",
    "name": "Foo",
    "fields": [
        {
            "name": "f1",
            "type": "string"
        },
        {
            "name": "f2",
            "type": "string",
            "default": ""
        }
    ]
}

I would like V2 consumers to see the property f2 with value "" when reading data produced with schema V1, but this does not happen, because the message is decoded using the old schema.

krukru avatar Feb 17 '20 14:02 krukru

None of this really has anything to do with this library, but rather with Avro. What this library does is read the schema id from each message, fetch the corresponding schema from the schema registry if needed, and then decode the message using that schema. Anything to do with schema evolution is purely up to how you design your Avro schemas.
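To make the "read the schema id from messages" step concrete, here is a minimal, dependency-free sketch of the Confluent wire format framing: one magic byte (0), a 4-byte big-endian schema id, then the Avro-encoded payload. The names `FramedMessage` and `parseFramedMessage` are made up for illustration and are not this library's API; the example bytes are hand-built.

```typescript
// Confluent wire format: [magic byte 0][4-byte big-endian schema id][Avro payload]
interface FramedMessage {
  schemaId: number;
  payload: Uint8Array;
}

// Hypothetical helper (not part of this library): split a message into schema id and payload.
function parseFramedMessage(message: Uint8Array): FramedMessage {
  if (message.length < 5) {
    throw new Error("Message too short to contain a schema id header");
  }
  if (message[0] !== 0) {
    throw new Error(`Unexpected magic byte: ${message[0]}`);
  }
  const view = new DataView(message.buffer, message.byteOffset, message.byteLength);
  const schemaId = view.getInt32(1, false); // false = big-endian
  return { schemaId, payload: message.subarray(5) };
}

// Hand-built example: schema id 42, followed by the V1 record { f1: "hi" }.
// An Avro string is a zig-zag encoded length (2 -> 0x04) followed by its UTF-8 bytes.
const exampleMessage = new Uint8Array([0, 0, 0, 0, 42, 0x04, 0x68, 0x69]);
const parsed = parseFramedMessage(exampleMessage);
```

The schema id is then used to look up the writer schema in the registry, and the payload is decoded with that schema. This is why the decoded object always has the shape of the schema the producer used.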

In your example, in order for the schema change to be backwards compatible, the field f2 would have to be optional. Most of the time you would probably make it a union of null and string, but maybe using a default value of empty string works as well. I'm not sure off the top of my head, but it's easy to try by just publishing your v1 and then seeing what happens when you try to publish v2. There is a table explaining what the different compatibility modes mean here: https://docs.confluent.io/current/schema-registry/avro.html#summary

Backwards compatibility in the Avro sense of the word means actually using the v2 schema to read messages produced with v1. That said, what would happen in reality with this library is that you would use the v1 schema to read the old message and the v2 schema to read the new message, so f2 would be undefined whenever you read a v1 message (the key wouldn't even exist).
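As a rough illustration of the difference, the sketch below models what reading with a reader schema would do for the Foo example: fields missing from the writer schema are filled in from the reader schema's defaults. This is a simplified, dependency-free model written for this thread, not this library's behaviour and not a full implementation of Avro schema resolution (it ignores type promotion, aliases, unions, and nested records). `resolveRecord` and the interfaces are hypothetical names.

```typescript
// Simplified model of an Avro record schema (only what this sketch needs).
interface FieldSchema {
  name: string;
  type: string;
  default?: unknown;
}
interface RecordSchema {
  type: "record";
  name: string;
  fields: FieldSchema[];
}

const v1: RecordSchema = {
  type: "record",
  name: "Foo",
  fields: [{ name: "f1", type: "string" }],
};
const v2: RecordSchema = {
  type: "record",
  name: "Foo",
  fields: [
    { name: "f1", type: "string" },
    { name: "f2", type: "string", default: "" },
  ],
};

// Hypothetical helper: reshape a record decoded with the writer schema
// into the shape the reader schema expects.
function resolveRecord(
  decoded: Record<string, unknown>,
  writer: RecordSchema,
  reader: RecordSchema
): Record<string, unknown> {
  const writerFields = new Set(writer.fields.map((f) => f.name));
  const result: Record<string, unknown> = {};
  for (const field of reader.fields) {
    if (writerFields.has(field.name)) {
      result[field.name] = decoded[field.name]; // present in the written data
    } else if ("default" in field) {
      result[field.name] = field.default; // missing: use the reader's default
    } else {
      throw new Error(`Reader field "${field.name}" has no default and was not written`);
    }
  }
  return result; // writer-only fields are dropped
}

const decodedWithV1: Record<string, unknown> = { f1: "hello" }; // what you get today
const asV2 = resolveRecord(decodedWithV1, v1, v2); // what a V2 consumer wants
```

Here `decodedWithV1` has no f2 key at all, while `asV2` has f2 set to the empty string, which is the behaviour asked for in the original question.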

Nevon avatar Feb 18 '20 07:02 Nevon

Hi @Nevon, thanks for the quick response! I went ahead and looked at avsc (the library this project uses to process Avro), and they mention resolvers as a way to achieve schema evolution. The Schema interface in this project does not expose the createResolver method, but it should exist in the Schema class. Would you consider it useful to expose createResolver in Schema?

krukru avatar Feb 18 '20 09:02 krukru

At our shop we developed a simpler implementation of a schema registry client back in the day, since this project was not yet available. It's called Castle (as in "Kafka's greatest work" :)).

Anyway we've recently encountered this issue ourselves, so we did implement reader schemas.

You guys are right, avsc does support it with createResolver, and it's pretty easy to implement. I'd be happy to give it a go and try to write a PR for it here too, if I had some guidance on where you think it's best to add it.

ivank avatar Jul 08 '21 13:07 ivank

@ivank I would be really interested in this as well. @Nevon could you let us know if such a PR would be accepted and how we would best approach it? I think the flexibility of reader schemas could be a great asset.

nick-zh avatar Sep 24 '21 21:09 nick-zh

I just saw that there is already an open PR for this: #137

nick-zh avatar Oct 07 '21 11:10 nick-zh