
AVRO-3135: ability to hook into schema ser/deser to implement schema refer…

Open zolyfarkas opened this issue 3 years ago • 8 comments

For details on the rationale of this change please see

The PR adds a unit test, TestSchemaSerializationHooks, which should also serve as a good example of how the functionality can be used.

zolyfarkas avatar May 11 '21 11:05 zolyfarkas

This would appear to entangle the parsing of the schema with an infrastructure dependency.

For example, if two different processes were parsing the same document and did not have the same reference resolver, or resolved the schema to different things, the actual schema would become non-deterministic.

adamkennedy avatar Oct 07 '21 23:10 adamkennedy

What is the purpose of the added file lang/s110/java/15/classes/org/apache/avro/specific/test/FullRecordV2$1.class ?

martin-g avatar Oct 08 '21 05:10 martin-g

What is the purpose of the added file lang/s110/java/15/classes/org/apache/avro/specific/test/FullRecordV2$1.class ?

that was unintended.

zolyfarkas avatar Oct 20 '21 22:10 zolyfarkas

This would appear to entangle the parsing of the schema with an infrastructure dependency.

For example, if two different processes were parsing the same document and did not have the same reference resolver, or resolved the schema to different things, the actual schema would become non-deterministic.

@adamkennedy this PR adds the ability to hook in a method to resolve a reference.

It is up to the library user to choose whether to use a resolver.

Different resolvers could resolve a schema differently, but I have not seen this happen in practice in my implementations (I have always used references pointing to immutable content).

One could make the same argument against logical types: you can have different conversions/implementations registered in different places, leading to non-deterministic behavior.
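
A minimal sketch of the immutable-reference point above, assuming references are versioned identifiers whose content never changes once published. Everything here is hypothetical and independent of the Avro API: the class name ImmutableRefResolver, the "group:name:version" reference format, and the use of plain strings for schema JSON are illustration only. It shows that two processes configured with the same immutable mapping resolve a reference to the same schema text, and that an unknown reference fails loudly instead of resolving to something else.

```java
import java.util.Map;

// Hypothetical stand-in for a reference resolver; not part of the Avro API.
final class ImmutableRefResolver {
    // Reference id -> schema JSON. Copied defensively so it cannot change later.
    private final Map<String, String> schemasByRef;

    ImmutableRefResolver(Map<String, String> schemasByRef) {
        this.schemasByRef = Map.copyOf(schemasByRef);
    }

    // Returns the schema JSON for a reference, or fails if it is unknown.
    String resolve(String ref) {
        String json = schemasByRef.get(ref);
        if (json == null) {
            throw new IllegalArgumentException("Unresolved schema reference: " + ref);
        }
        return json;
    }

    public static void main(String[] args) {
        // Assumed reference format: "group:name:version" (illustrative only).
        Map<String, String> published = Map.of(
            "org.example:user:1.0.0",
            "{\"type\":\"record\",\"name\":\"User\",\"fields\":[]}");

        // Two "processes" sharing the same immutable content.
        ImmutableRefResolver processA = new ImmutableRefResolver(published);
        ImmutableRefResolver processB = new ImmutableRefResolver(published);

        String a = processA.resolve("org.example:user:1.0.0");
        String b = processB.resolve("org.example:user:1.0.0");
        System.out.println(a.equals(b));
    }
}
```

The determinism here comes entirely from the content behind each reference being immutable; a resolver backed by mutable content would not give the same guarantee.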

zolyfarkas avatar Oct 20 '21 22:10 zolyfarkas

What is the purpose of the added file lang/s110/java/15/classes/org/apache/avro/specific/test/FullRecordV2$1.class ?

that was unintended.

removed

zolyfarkas avatar Oct 20 '21 23:10 zolyfarkas

The thing I like about current .avsc files is that they are complete. Using external references breaks that, and would require some form of compilation/resolution to make the schema complete again.

Instead, one can also use .avdl (IDL) files: these support resolving imports from both

  • the file system (splitting large schemata in multiple files) and
  • the class path (allowing your dependency system to import schemata by version)

opwvhk avatar Nov 16 '21 07:11 opwvhk

The thing I like about current .avsc files is that they are complete. Using external references breaks that, and would require some form of compilation/resolution to make the schema complete again.

Instead, one can also use .avdl (IDL) files: these support resolving imports from both

  • the file system (splitting large schemata in multiple files) and
  • the class path (allowing your dependency system to import schemata by version)

avsc and avdl are not equivalent: avsc is a data format for schemas, while avdl is a format for interfaces/protocols. avdl is not serialization/deserialization friendly, but its JSON representation, .avpr, is.

I understand your point. What about this change enabling a new schema format? Let's call it ".ravsc": .avsc remains reference-free, while .ravsc introduces reference support.

Think of this PR as enabling the ability to implement .ravsc.

To understand why I think this is worthwhile, see the use cases I described at.

zolyfarkas avatar Jan 14 '22 14:01 zolyfarkas

Just a question about the proposed code: why put the methods customRead and customWrite in the Parser.Names class (which is a simple registry of known schemas) rather than directly in the Parser class or, even better, in a new interface such as CustomSerializer:

interface CustomSerializer {
  Schema customRead(Function<String, JsonNode> object);
  boolean customWrite(Schema schema, JsonGenerator gen) throws IOException;
}
...
public Parser(CustomSerializer cs) {
  this.cs = cs;
}
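
To make the constructor-injection idea above concrete, here is a simplified sketch that does not depend on Avro or Jackson. It is an assumption-laden stand-in, not the PR's code: String replaces both Schema and JsonNode, and the names ReferenceHook, SketchParser, the "$ref" field, and the "resolved:"/"inline:" prefixes are all hypothetical. It only illustrates how a parser could ask an injected hook first and fall back to its normal path when the hook declines.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical, simplified hook: String stands in for Avro's Schema and Jackson's JsonNode.
interface ReferenceHook {
    // Returns a resolved schema if the JSON object is a reference, otherwise empty.
    Optional<String> customRead(Function<String, String> fieldLookup);
}

// Hypothetical parser that receives the hook through its constructor.
final class SketchParser {
    private final ReferenceHook hook;

    SketchParser(ReferenceHook hook) {
        this.hook = hook;
    }

    // Asks the hook first; falls back to "normal" parsing when the hook declines.
    String parse(Map<String, String> jsonObject) {
        return hook.customRead(jsonObject::get)
            .orElseGet(() -> "inline:" + jsonObject.get("name"));
    }

    public static void main(String[] args) {
        // Assumed convention: an object with a "$ref" field is a reference.
        ReferenceHook hook = fields ->
            Optional.ofNullable(fields.apply("$ref")).map(ref -> "resolved:" + ref);

        SketchParser parser = new SketchParser(hook);
        System.out.println(parser.parse(Map.of("$ref", "org.example:user:1.0.0")));
        System.out.println(parser.parse(Map.of("name", "User")));
    }
}
```

Keeping the hook in its own interface, as suggested, means the schema registry stays a plain registry and a parser built without a hook behaves exactly as before.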

clesaec avatar Jun 01 '22 15:06 clesaec

GitHub seems to have messed up this PR... I will create a new one.

zolyfarkas avatar Mar 02 '24 21:03 zolyfarkas