jackson-dataformats-binary

[Avro] feature wish: write enums as string

Open jlous opened this issue 1 month ago • 3 comments

With JSON, I could always extend an enum (on both the sender and receiver ends), consider it a backwards-compatible change, and the receiver could parse both old and new documents equally well.

With Avro, altering an enum makes the entire schema incompatible, and an updated schema cannot be used to read old docs. For all my use cases, serialising all enums as strings would be a completely acceptable strategy for avoiding this problem, and much preferable to versioned formats, but Jackson's Avro support does not currently seem to offer this.

So I'm hoping for a new feature switch for Avro: WRITE_ENUMS_AS_STRINGS or similar.

In my case I only really care about serialising, since the receiver is on a completely different platform, but I guess parity on the deserialising side would be natural to include.

jlous avatar May 10 '24 11:05 jlous

Would this be possible given Avro schema limitations? Would the new type be defined as a Union, allowing both String and Enum? And in which direction is the change (older schema expecting String, newer one Enum, or vice versa)?

At the JsonGenerator level (and so in the AvroGenerator subtype), Enums are typically written using writeString() anyway, since JSON has no "Enum" type; the conversion is handled at the databind level. The same goes for JsonParser/AvroParser.

So I am not yet 100% sure I understand the ask here.

cowtowncoder avatar May 10 '24 17:05 cowtowncoder

I am suggesting an option where the generated Avro schema for an enum field would simply be String.

This would enable extending the enum in the future, with no change in schema.

jlous avatar May 13 '24 13:05 jlous
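
To illustrate why a String-typed field tolerates new enum constants without a schema change, here is a minimal sketch of the two wire encodings as described in the Avro specification: an enum is written as the zero-based index of its symbol in the schema, while a string is written as a length prefix followed by UTF-8 bytes. This is not Jackson code; the class `AvroEnumVsString` and its methods are hypothetical names used only for illustration, and the symbol list stands in for the `symbols` array of an enum schema.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical illustration class, not part of Jackson or Avro.
final class AvroEnumVsString {

    // Avro int/long encoding: zigzag, then base-128 varint (low 7 bits first).
    static void writeLong(ByteArrayOutputStream out, long n) {
        long z = (n << 1) ^ (n >> 63);
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80));
            z >>>= 7;
        }
        out.write((int) z);
    }

    // Enum encoding: only the position of the symbol in the schema is written,
    // so a value missing from the schema's symbol list cannot be encoded at all.
    static byte[] encodeAsEnum(List<String> schemaSymbols, String value) {
        int index = schemaSymbols.indexOf(value);
        if (index < 0) {
            throw new IllegalArgumentException("Symbol not in schema: " + value);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, index);
        return out.toByteArray();
    }

    // String encoding: the name itself is written, independent of any symbol list.
    static byte[] encodeAsString(String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeLong(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }
}
```

With a symbol list of `RED, GREEN`, the enum encoding of `GREEN` depends on that list, and a later-added `BLUE` is rejected until the schema changes. The string encoding handles `BLUE` with no schema change, at the cost of a few more bytes per value.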

@jlous Ah ok. Depending on how it is implemented, it might even be a general MapperFeature; I forget what the division is between model traversal (callbacks generated on serialization settings) and construction of the Avro schema.

At this point I probably won't have time to work on this in the near term, but I would be happy to help if anyone else wants to tackle it.

cowtowncoder avatar May 14 '24 23:05 cowtowncoder

@jlous BTW: If you change the enum to a string type in the Avro schema, serialization and deserialization should already work.

MichalFoksa avatar May 19 '24 07:05 MichalFoksa