
AVRO-1759: [java] Automatic union types for sealed classes

Open ashley-taylor opened this issue 5 months ago • 3 comments

What is the purpose of the change

This pull request eliminates the need to manually configure the @Union annotation when using sealed classes. This should increase the number of cases in which Avro serialisation just works, without the need to consult the documentation to handle polymorphic types.

I feel this is the best solution for AVRO-1759. Another approach would be to use a method similar to that of the Reflections library, but I feel that would be a step too far for this library.

It also addresses AVRO-1568 if third-party libraries start adding sealed classes as well.

Reading the sealed-class methods via reflection, rather than calling them directly, keeps this code compatible with Java 11.
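To illustrate the reflection approach, here is a minimal sketch, not the code from this PR. It assumes the methods in question are `Class.isSealed()` and `Class.getPermittedSubclasses()` from Java 17; the class and method names `SealedReflection` and `permittedSubclasses` are hypothetical. Because the Java 17 methods are looked up by name at run time, the source still compiles with a Java 11 compiler and returns an empty result on older JVMs.

```java
import java.lang.reflect.Method;

// Hypothetical helper, for illustration only.
final class SealedReflection {

    /**
     * Returns the permitted subclasses of a sealed class, or an empty array
     * if the class is not sealed or the JVM predates sealed classes.
     */
    static Class<?>[] permittedSubclasses(Class<?> type) {
        try {
            // Looked up by name so that this file compiles on Java 11.
            Method isSealed = Class.class.getMethod("isSealed");
            if (!(Boolean) isSealed.invoke(type)) {
                return new Class<?>[0];
            }
            Method getPermitted = Class.class.getMethod("getPermittedSubclasses");
            Class<?>[] permitted = (Class<?>[]) getPermitted.invoke(type);
            return permitted == null ? new Class<?>[0] : permitted;
        } catch (NoSuchMethodException e) {
            // Running on a JVM without sealed classes: nothing to discover.
            return new Class<?>[0];
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not inspect " + type, e);
        }
    }
}
```

A caller could then treat a non-empty result as the member types of the union, in the same way as the classes listed in a manual @Union annotation.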

Verifying this change

This change adds tests and can be verified as follows: new tests in the java17-test module verify the behaviour.

Documentation

  • Does this pull request introduce a new feature? (yes)
  • If yes, how is the feature documented? Built-in Java 17+ language feature now works without manual configuration.

ashley-taylor, Jul 19 '25 10:07

@martin-g @opwvhk Another PR. This one is not related to my goal of adding record support; it is a separate, simpler Java 17+ feature.

Thanks

ashley-taylor, Jul 19 '25 10:07

@opwvhk I got ahead of myself. Going forward, I will endeavour to run them with the dispatch command before opening the PR. Ready for review now.

ashley-taylor, Jul 28 '25 05:07

I don't use Avro in Java myself, but I wonder if this should offer an opt-out. Maybe the developer has a class hierarchy only for selecting different in-memory representations and does not intend to encode it as an Avro union. I mean something similar to the .NET class BufferedLogRecord, which implements many of its properties as returning null by default; derived classes override only those properties for which they can return a different value. So when the log buffering implementation gets a log entry and translates it to a BufferedLogRecord, it can choose a derived class at run time, depending on which properties it has been configured to save and whether the actual log entry has non-null values for those properties.

(A log entry in .NET may contain references to objects that will be recycled for other purposes after the logging call returns, but before the log buffer is flushed. That's why the log entry cannot be stored in a log buffer as is.)

OTOH, BufferedLogRecord is an abstract class, and I imagine a similar thing in Java would likewise be abstract and could not be automatically deserialized from a non-union Avro object, so perhaps types like that are unlikely to be used as Avro data.

KalleOlaviNiemitalo, Nov 10 '25 11:11