Proposed roadmap for Vulcan 2.0
Summary
Here are some thoughts on how to decouple the Vulcan API from the Java Avro SDK (JAvro), opening the way to adding an alternative backend that implements Avro directly. Previously we discussed doing this by introducing our own representation of encoded Avro values, so that Vulcan would convert between these and user types, and backends would convert between these and Avro wire formats. Instead, I want to suggest that we convert the codecs into an algebraic datatype that can be traversed by a separate interpreter to convert directly between user types and an arbitrary backend representation.
This has a few advantages:
- It avoids adding an extra layer of indirection at runtime.
- Most of the work can be done incrementally as non-breaking changes in the 1.x series, as the implementation of `Codec` is invisible to users (whereas the representation of Avro values isn't).
- It reduces API surface area - we can keep the details of the `Codec` ADT package-private, whereas it's not clear we'd be able to do the same for a model of Avro values.
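To make the idea concrete, here is a minimal sketch of a codec ADT traversed by a separate interpreter. Everything in it is an assumption for illustration: `IntCodec`, `StringCodec`, `ListCodec`, `Backend` and `TextBackend` are hypothetical names, not Vulcan's actual API, and a plain `String` stands in for a real backend representation.

```scala
// Hypothetical, heavily simplified model; these names are NOT Vulcan's actual API.
sealed trait Codec[A]

object Codec {
  case object IntCodec extends Codec[Int]
  case object StringCodec extends Codec[String]
  final case class ListCodec[A](element: Codec[A]) extends Codec[List[A]]
}

// A backend is an interpreter that pattern-matches on the ADT.
// Repr stands in for whatever value representation the backend uses.
trait Backend[Repr] {
  def encoder[A](codec: Codec[A]): A => Repr
}

// Toy backend that renders values as text, going straight from
// user types to the backend representation with no intermediate model.
object TextBackend extends Backend[String] {
  import Codec._

  def encoder[A](codec: Codec[A]): A => String = codec match {
    case IntCodec    => (value: A) => value.toString
    case StringCodec => (value: A) => "\"" + value + "\""
    case list: ListCodec[a] =>
      // Build the element encoder once, then reuse it for every element.
      val encodeElement: a => String = encoder(list.element)
      (values: List[a]) => values.map(encodeElement).mkString("[", ",", "]")
  }
}
```

Because the codec is plain data, a second backend would only need its own `Backend` instance, and the `Codec` cases themselves would not have to mention JAvro.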
Roadmap
Changes in 1.x
- [x] Per https://github.com/fd4s/vulcan/pull/435, deprecate `Codec.instance` (which is coupled directly to the JAvro API) and replace most uses of it with a few primitives and combinators.
- [ ] Convert codecs to a fully introspectable algebraic datatype. Following the example of `UnionCodec` in https://github.com/fd4s/vulcan/pull/435, convert all primitive codecs and combinators into named subtypes.
- [ ] Refactor implementations of primitives and combinators into an interpreter of the newly introduced ADT. `encode`, `decode` and `schema` now delegate to the interpreter.
- [ ] Deprecate `encode`, `decode`, and `schema` methods on `Codec`, in favour of explicit use of the interpreter, to prepare for fully decoupling `Codec` from JAvro.
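The last two steps could look roughly like the sketch below: the method on the codec stays for source compatibility but only delegates, and is deprecated in favour of calling the interpreter directly. The names `ToyInterpreter`, `IntCodec` and `BooleanCodec` are hypothetical, the `String` result is a simplification of the real return type, and the `"1.x"` version string is a placeholder.

```scala
// Hypothetical sketch; names and signatures are simplified and NOT Vulcan's actual API.
sealed trait Codec[A] {
  // Kept for compatibility during 1.x; the logic itself lives in the interpreter.
  @deprecated("Call ToyInterpreter.encode(codec, value) instead", "1.x")
  final def encode(value: A): String = ToyInterpreter.encode(this, value)
}

case object IntCodec extends Codec[Int]
case object BooleanCodec extends Codec[Boolean]

// Stand-in for the JAvro-based interpreter that would eventually move to its own module.
object ToyInterpreter {
  def encode[A](codec: Codec[A], value: A): String = codec match {
    case IntCodec     => s"int:$value"
    case BooleanCodec => s"boolean:$value"
  }
}
```

Callers migrate from `IntCodec.encode(42)` to `ToyInterpreter.encode(IntCodec, 42)`; once nothing uses the deprecated method, removing it in 2.0 leaves the codec free of backend-specific code.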
Changes in 2.0
- [ ] Remove `Codec.instance` - all codecs must be derived from primitives we provide.
- [ ] Move methods for serialization and deserialization from `Codec` to live with the JAvro-based interpreter.
- [ ] Remove `encode`, `decode` and `schema` methods on `Codec`.
- [ ] Consider exposing an alternative representation of schemas directly on `Codec`, either as a raw JSON string or as our own structured representation of schemas.
- [ ] `AvroTypes` types are no longer aliases for JAvro representations - instead they are either phantom types or our own model of Avro schemas.
- [ ] `Codec` API is fully decoupled from JAvro.
- [ ] Separate the JAvro-based interpreter into a new module, remove JAvro dependency from core module.
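As a rough illustration of the phantom-type and raw-JSON-schema options, here is one possible shape. It is a sketch under assumptions, not a proposal for the final API: `AInt`, `AString`, `AArray`, `schemaJson` and the codec names are all hypothetical.

```scala
// Hypothetical sketch; none of these names are Vulcan's actual API.
object AvroTypes {
  // Phantom types: never instantiated, they only tag a codec with its Avro type.
  sealed trait AInt
  sealed trait AString
  sealed trait AArray[T]
}

// AvroType is a phantom parameter; A is the user-facing Scala type.
sealed trait Codec[AvroType, A] {
  // Schema exposed as a raw JSON string, with no JAvro Schema involved.
  def schemaJson: String
}

object Codec {
  import AvroTypes._

  case object IntCodec extends Codec[AInt, Int] {
    val schemaJson: String = "\"int\""
  }

  case object StringCodec extends Codec[AString, String] {
    val schemaJson: String = "\"string\""
  }

  final case class ListCodec[T, A](element: Codec[T, A])
      extends Codec[AArray[T], List[A]] {
    val schemaJson: String =
      s"""{"type":"array","items":${element.schemaJson}}"""
  }
}
```

With this shape the core module can still track which Avro type a codec targets at the type level, while only the separate JAvro module would need to parse `schemaJson` into a JAvro schema.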
Hopefully these changes won't impact most users too much, given that the most common use case is via the integration with fs2-kafka.
Any feedback would be much appreciated!
@vlovgr any thoughts or concerns on this before I move forward with it further?
@bplommer It sounds like a solid and exciting plan. 👍 🎉