jackson-dataformats-binary
[avro] Add support for reading schema from Avro-encoded file
(moved from https://github.com/FasterXML/jackson-dataformat-avro/issues/10)
Avro streams may include an embedded schema. Since it should be relatively safe either to auto-detect it or to make reading it the default when no schema is specified, we should support this mode.
As to sample data, maybe this project:
https://github.com/miguno/avro-cli-examples
has data we could use for confirming proper usage.
A follow-up feature should probably be producing output with an embedded schema, but that would be a separate RFE.
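For reference, the Avro specification places the embedded schema in the header of an Object Container File: four magic bytes, then a metadata map (string keys, bytes values) in which the `avro.schema` key holds the schema JSON, then a 16-byte sync marker. The sketch below reads that header using only the JDK. The class `AvroHeaderSchema` and its methods are hypothetical names made up for illustration; they are not part of Jackson or of the Apache Avro library, and this is not how the feature would necessarily be implemented in the module.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only; not a Jackson or Avro API.
final class AvroHeaderSchema {
    private AvroHeaderSchema() { }

    /** Returns the JSON text stored under "avro.schema", or null if the key is absent. */
    static String readEmbeddedSchema(InputStream in) throws IOException {
        DataInputStream din = new DataInputStream(in);
        byte[] magic = new byte[4];
        din.readFully(magic);
        if (magic[0] != 'O' || magic[1] != 'b' || magic[2] != 'j' || magic[3] != 1) {
            throw new IOException("Not an Avro container file: bad magic bytes");
        }
        String schema = null;
        // The metadata map is written as blocks; a block count of 0 ends the map.
        for (long count = readLong(din); count != 0; count = readLong(din)) {
            if (count < 0) {
                count = -count;
                readLong(din); // a negative count is followed by the block size in bytes
            }
            for (long i = 0; i < count; i++) {
                String key = new String(readBytes(din), StandardCharsets.UTF_8);
                byte[] value = readBytes(din);
                if ("avro.schema".equals(key)) {
                    schema = new String(value, StandardCharsets.UTF_8);
                }
            }
        }
        return schema;
    }

    /** Convenience overload for in-memory content; wraps IOException as unchecked. */
    static String readEmbeddedSchema(byte[] data) {
        try {
            return readEmbeddedSchema(new ByteArrayInputStream(data));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads one Avro long: little-endian base-128 varint, then zigzag decoding. */
    static long readLong(DataInputStream in) throws IOException {
        long raw = 0;
        int shift = 0;
        while (true) {
            int b = in.readUnsignedByte();
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
            if (shift > 63) {
                throw new IOException("Varint too long");
            }
        }
        return (raw >>> 1) ^ -(raw & 1);
    }

    /** Reads a length-prefixed byte sequence (used for both Avro strings and bytes). */
    static byte[] readBytes(DataInputStream in) throws IOException {
        long len = readLong(in);
        if (len < 0 || len > Integer.MAX_VALUE) {
            throw new IOException("Invalid length: " + len);
        }
        byte[] buf = new byte[(int) len];
        in.readFully(buf);
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java AvroHeaderSchema path/to/file.avro
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println(readEmbeddedSchema(in));
        }
    }
}
```

The returned JSON could then be parsed into a schema object and handed to a reader; how that would be wired into the Avro module is exactly what this issue leaves open.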
Is this ticket resolved? I noticed it's referenced in the cherry-pick commit.
@bkenned4 No; may have accidentally included issue id of the old repo.
As to implementation, I suspect auto-detection may be slightly risky (it is possible for encoded data to start with the same 4 bytes). But as long as it's a format feature, disabled by default, it may make sense. In addition to forcing use
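To make the risk concrete: per the Avro specification, a container file starts with the four bytes `O`, `b`, `j`, `0x01`. Schema-less Avro binary has no framing, so ordinary data can produce the same prefix. In the sketch below, a record whose first four fields are the longs -40, 49, 53 and -1 encodes to exactly those bytes. The class `AvroMagic` and its methods are hypothetical names for illustration, not an existing Jackson or Avro API.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration only; not a Jackson or Avro API.
final class AvroMagic {
    // Avro Object Container File magic: ASCII 'O', 'b', 'j', then the byte 1.
    private static final byte[] MAGIC = { 'O', 'b', 'j', 1 };

    private AvroMagic() { }

    /** True if the content begins with the container-file magic bytes. */
    static boolean startsWithMagic(byte[] data) {
        if (data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Encodes longs the way Avro binary does: zigzag, then base-128 varint. */
    static byte[] encodeLongs(long... values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long n : values) {
            long v = (n << 1) ^ (n >> 63); // zigzag: small magnitudes become small codes
            while ((v & ~0x7FL) != 0) {
                out.write((int) ((v & 0x7F) | 0x80));
                v >>>= 7;
            }
            out.write((int) v);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Plain record data (four long fields) that collides with the magic prefix.
        byte[] plainRecord = encodeLongs(-40, 49, 53, -1);
        System.out.println(startsWithMagic(plainRecord));
    }
}
```

Here `startsWithMagic` returns `true` for data that contains no schema at all, which is why an explicit, off-by-default feature seems safer than unconditional auto-detection.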
@cowtowncoder clear. thanks for the context
@cowtowncoder what's the status of this issue? This'd be 100% useful in a number of cases. For example, another part of my system is generating AVRO documents, and I know for sure that the schema is present, so at least a possibility of a manual schema detection would be nice!
@cowtowncoder given that the last comment was a few years ago, I'm not sure where this issue stands, but it looks like it is still open. I'm currently writing a Spring Boot application to consume multiple files in different formats (XML, CSV, Avro), and this would help a lot with keeping the code clean and easy to follow. Thank you
At this point I do not have time to work on this feature, even though I fully agree that this would be a great feature.
However: if someone has the itch and would like to try to produce a PR, I will find time to help get the PR refined and hopefully merged. At this point such a contribution could make it into the upcoming 2.14.0 and earn kudos for a really, really nice addition from happy users. :)
Also: one thing that can help motivate others is to "up vote" issue with "thumbs up" reaction. While that does not change anyone's availability, sometimes it can help prioritize things nonetheless.
An additional idea: if you don't think you know how to tackle a somewhat advanced feature like this one (it's not trivial to figure out where and how to plug it in if you are not familiar with the project, at least), one thing that would be helpful is simply a unit test: a test that tries to read an input file containing an embedded Schema, using the default parser/mapper with no extra settings -- and that would currently fail.
The feature implementor would only need to modify the test lightly to enable schema-reading (I think an AvroParser.Feature is needed since, due to zero redundancy, it is not possible to detect 100% reliably that content starts with a Schema), but could use it to verify that the feature works.