[Proposal]: Add UUID conversion to and from 16 byte fixed sequences
UUIDs are often passed around in application code in their canonical hex-string representation, e.g. "550e8400-e29b-41d4-a716-446655440000". Encoding a UUID as an Avro "string" takes 37 bytes (36 characters plus a 1 byte length prefix), while its binary form fits into a 16 byte "fixed", saving 21 bytes per encoded value.
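The size comparison can be checked with a short script. This is only a sketch using the Elixir standard library; it hand-builds the two encodings from the Avro binary encoding rules rather than calling this library, and the variable names are illustrative.

```elixir
canonical = "550e8400-e29b-41d4-a716-446655440000"

# Avro "string": a zigzag varint length prefix followed by the UTF-8 bytes.
# For a non-negative length n the zigzag value is 2n; 72 < 128 fits in one byte.
length_prefix = <<byte_size(canonical) * 2>>
as_string = length_prefix <> canonical

# Avro "fixed" of size 16: the raw bytes with no prefix.
as_fixed = canonical |> String.replace("-", "") |> Base.decode16!(case: :mixed)

IO.inspect({byte_size(as_string), byte_size(as_fixed)})
# prints {37, 16}
```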
This change allows application code to keep passing around canonical hex UUIDs while converting to the compact encoding, requiring only uuid_format: :canonical_string to be given in decode options.
The Java reference implementation also supports encoding UUIDs as both strings and 16 byte fixed sequences.
- Encoding is augmented such that a 16 byte fixed schema with %{"logicalType" => "uuid"} converts a hex-string UUID to the 16 byte binary representation.
- Decoding is augmented such that, given uuid_format: :canonical_string in decode options, the binary representation is converted to the canonical hex-string representation.
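
The two conversions can be sketched roughly as follows. This is an illustration under assumptions, not the code in this change: the module and function names (UuidSketch, to_binary/1, to_canonical/1) are hypothetical, and it uses the standard library Base module instead of the uniq library.

```elixir
defmodule UuidSketch do
  # Hypothetical helper: canonical hex string (8-4-4-4-12) -> 16 raw bytes.
  def to_binary(
        <<a::binary-size(8), "-", b::binary-size(4), "-", c::binary-size(4), "-",
          d::binary-size(4), "-", e::binary-size(12)>>
      ) do
    Base.decode16!(a <> b <> c <> d <> e, case: :mixed)
  end

  # Hypothetical helper: 16 raw bytes -> lowercase canonical hex string.
  def to_canonical(
        <<a::binary-size(4), b::binary-size(2), c::binary-size(2), d::binary-size(2),
          e::binary-size(6)>>
      ) do
    [a, b, c, d, e]
    |> Enum.map(&Base.encode16(&1, case: :lower))
    |> Enum.join("-")
  end
end

hex = "550e8400-e29b-41d4-a716-446655440000"
bin = UuidSketch.to_binary(hex)

# The binary form is what a 16 byte "fixed" carries on the wire.
IO.inspect(byte_size(bin))
# Converting back yields the original canonical string.
IO.inspect(UuidSketch.to_canonical(bin) == hex)
```

Both function heads pattern-match on the expected layout, so input of any other shape raises a FunctionClauseError instead of being silently converted.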
The encoding change is nearly backwards-compatible: previously, an incorrectly sized "fixed" with {"logicalType": "uuid"} raised an error, whereas conversion is now attempted.
The decoding change is fully backwards-compatible, as uuid_format defaults to :binary.
The uniq library was added as the UUID codec; it has no transitive dependencies.