jackson-dataformats-binary icon indicating copy to clipboard operation
jackson-dataformats-binary copied to clipboard

[Avro] Incompatibility with Avro >=1.9.0

Open Sage-Pierce opened this issue 6 years ago • 39 comments

Avro version 1.9.0 was released last month (https://mvnrepository.com/artifact/org.apache.avro/avro) and my team has run in to compatibility issues between jackson-dataformat-avro and this latest release.

In particular, it looks to be the case that Avro has switched from Codehaus Jackson to FasterXML Jackson, which results in NoSuchMethodError thrown when accessing utility methods.

The following test exposes the issue:

    @Test
    public void beanShouldBeSerializableAsAvroBinaryAndDeserializable() throws Exception {
        TestBean testBean = new TestBean();
        testBean.setData("Data");

        AvroMapper avroMapper = new AvroMapper();

        AvroSchema avroSchema = avroMapper.schemaFor(TestBean.class);

        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        avroMapper.writer().with(avroSchema).writeValue(outputStream, testBean);

        byte[] serialized = outputStream.toByteArray();

        assertNotNull(serialized);
        assertTrue(serialized.length > 0);

        TestBean deserialized = avroMapper.readerFor(TestBean.class).with(avroSchema).readValue(serialized);

        assertEquals(testBean, deserialized);
    }

    public static final class TestBean {

        private String data;

        @Override
        public boolean equals(Object o) {
            if (this == o)
                return true;
            if (o == null || getClass() != o.getClass())
                return false;
            TestBean testBean = (TestBean) o;
            return Objects.equals(data, testBean.data);
        }

        @Override
        public int hashCode() {
            return Objects.hash(data);
        }

        public String getData() {
            return data;
        }

        public void setData(String data) {
            this.data = data;
        }
    }

This test passes on v1.8.2 of Avro, but fails with v1.9.0 on the following:

java.lang.NoSuchMethodError: org.apache.avro.util.internal.JacksonUtils.toObject(Lorg/codehaus/jackson/JsonNode;)Ljava/lang/Object;

	at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.schemaFieldForWriter(RecordVisitor.java:192)
	at com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor.optionalProperty(RecordVisitor.java:121)
	at com.fasterxml.jackson.databind.ser.BeanPropertyWriter.depositSchemaProperty(BeanPropertyWriter.java:839)
	at com.fasterxml.jackson.databind.ser.std.BeanSerializerBase.acceptJsonFormatVisitor(BeanSerializerBase.java:861)
	at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.acceptJsonFormatVisitor(DefaultSerializerProvider.java:566)
	at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:3874)
	at com.fasterxml.jackson.databind.ObjectMapper.acceptJsonFormatVisitor(ObjectMapper.java:3853)
	at com.fasterxml.jackson.dataformat.avro.AvroMapper.schemaFor(AvroMapper.java:96)

Note that we are using com.fasterxml.jackson.dataformat:jackson-dataformat-avro:2.9.9

Sage-Pierce avatar Jun 26 '19 23:06 Sage-Pierce

Hmmh. So the problem is that API of Apache Avro library is incompatible between 1.8.x and 1.9 -- one exposes Jackson 1.x JsonNode (1.8 and before), latter Jackson 2.x JsonNode. This is major incompatibility and something that Jackson Avro module probably can do nothing about, in the sense that if you use Jackson Avro Module 2.9 or earlier, you MUST use Apache Avro module version 1.8.x. I can change this for Jackson 2.10 to upgrade to require 1.9.0 (with similar limitation that earlier versions can not be used).

Thank you for reporting the issue: I think it is good that apache avro library has upgraded to Jackson 2.x, but it is bit problematic as types are exposed in API on short term.

cowtowncoder avatar Jul 02 '19 23:07 cowtowncoder

As of Apache Avro 1.9.0 they updated to depend on Jackson 2.9.7, but the problem is actually in the com.fasterxml.jackson.dataformat.avro.schema.RecordVisitor class which is depending on "org.apache.avro.util.internal.JacksonUtils", which used to reference the old org.codehaus.jackson.JsonNode. The problem is the Jackson RecordVisitor class should not be relying on an 'internal' API from org.apache.avro.util.internal, and certainly should not be referring to classes from old versions of Jackson.

salt-tony avatar Jul 22 '19 06:07 salt-tony

@salt-tony yes, I understand the mechanism pretty well and attempted to summarize above. But I am not sure of the point you are trying to make here?

My interest is not in figuring out what to blame (I agree that relying on internal parts of another library is not a good idea) but to figure out how to try to contain further version incompatibilities. Since Jackson versions up to 2.9.x will be bound to dependencies of Avro lib 1.8.x, it is not possible to upgrade to later Avro library there; and anything using Jackson 2.9.x Avro module can not use apache Avro 1.9.x or above. But there is further problem that if with 2.10 we upgrade dependency here, it will then require upgrade of apache avro library for any mixed use cases. Alternatively it would be possible keep status quo for Jackson 2.x and not upgrade to apache avro 1.9.x and beyond; and wait for Jackson 3.0 to make the jump.

Note, too, that Avro module is only relying on Jackson 1.x types because Avro 1.8.x exposes default values using 1.x JsonNode values; and internal JacksonUtils class seems to have been referenced for necessary conversions (I did not write code in question and failed to notice reference on code review); but even without it, reference to 1.x types is unavoidable before new version.

cowtowncoder avatar Jul 23 '19 00:07 cowtowncoder

After investigating this for a bit now I am much less hopeful in figuring out a working way to upgrade Avro module to work with Apache Avro lib 1.9.x. Part of the problem is that the complicated way some of unit tests work, it is difficult to see what actually breaks: I can get code to compile fine, but many things do not seem to work the way they used to in avro lib.

So there is a distinct possibility that Avro module will require pre-1.9 version of Apache Avro lib, at least for 2.10.

cowtowncoder avatar Jul 24 '19 05:07 cowtowncoder

Ohhhh man Avro has gone and created a right nasty mess. I've done a bit of digging, and there's some good news and some bad news:

  1. Good: jackson-dataformat-avro can be made binary compatible with both 1.8.X and 1.9.X without too much work. We were leveraging some Avro helper utils to avoid duplicating the code and increase the likelihood of remaining api-compatible as things evolved. These just need to be replaced with fresh code that lives in jackson-dataformat-avro. Slightly increases the risk of api-compatibility (specifically around how default values are encoded and interpreted in schemas), but all of the regression tests still pass with 1.8.X and I expect the use-cases to be limited to a domain of types that shouldn't have compatability issues.

  2. Bad: Avro changed the way they encode type information in schemas. In 1.8.X, nested classes were encoded as

     {"type":"record","namespace":"package.OuterClass$","name":"InnerClass", ....}
    

    Which was really helpful, because you could tell the difference between a nested class and a top-level class, and that made runtime type discovery relatively easy because these could be unambiguously concatenated and then fed into Class.forName() to find the type during deserialization. However, with 1.9.X, they've started serializing nested classes sans the trailing $ in the namespace:

     {"type":"record","namespace":"package.OuterClass","name":"InnerClass", ....}
    

    Which we can no longer simply feed into Class.forName(), and now have to do a "Guess and check" by probing both the non-nested and nested combinations of namespace+name. This makes 1.9.X not forwards compatible with 1.8.X over the wire.

The unit tests catch this pretty handily (all of the "Apache(1.9.X) -> Jackson w/ Apache(1.9.X) schema" round-trip tests start failing, while their counterparts do not), and once fixed reveal a handful of other edge cases that need to be looked at. It appears that most of the other failing unit tests are similarly related to this issue, in the forms of Enums not resolving correctly or Unions not resolving correctly because we had relied upon Avro helpers to resolve these situations in a compatible manner; It looks like they will have to be rewritten and kept in sync instead.

In general, I recommend staying away from 1.9.X unless you're prepared to upgrade your whole ecosystem (or at least every consumer) all at once because the schema format is not forwards compatible. I think these can all be worked around/fixed/patched so that Jackson can read from both versions and write from both versions without any major overhauls/work, but we do need to clone&own some pieces we were leveraging from Avro.

baharclerode avatar Jul 26 '19 21:07 baharclerode

@baharclerode Thank you for digging bit deeper! Your comments make more sense to me, and I think that at least for 2.10 we'll better stay on 1.8.x. Even if it means we can't get rid of Jackson 1.x dependency.

I would like to see the minor cleanup wrt replacing use of internal.JacksonUtils, as that makes needed changes slightly simpler.

cowtowncoder avatar Jul 26 '19 22:07 cowtowncoder

I'll see if I can get two PRs together:

  1. Remove direct dependency on Jackson 1.X / internal.JacksonUtils
  2. Support deserializing from Apache 1.9.X schemas

The irony is I'm pretty sure this behavior was changed because the $ in namespaces isn't allowed by the spec, but the java implementation used it accidentally and it was never caught because until now, namespaces were not validated by the java implementation (AVRO-2143); However, the java implementation will still generate invalid namespaces if you doubly-nest a class, i.e.:

package example;
public class Foo {
    public static class Bar {
        public static class Baz {}
    }
}

will still have a name of Baz and a namespace of example.Foo$Bar since they only fixed it for one level deep and didn't handle anything beyond that.

baharclerode avatar Jul 26 '19 23:07 baharclerode

Heh. Oh boy. But yes, +1 for those PRs. I'll also need to find time to work on other submitted PRs, bugs against Avro module, to get things shipshape for 2.10.

cowtowncoder avatar Jul 26 '19 23:07 cowtowncoder

Hi cowtowncoder are you working on this?

Avro 1.8.x has vulnerabilities and we need to update avro or remove jackson-dataformats-binary from our projects. Is there any possibility to support 1.9.x in short term?

dani-e-l avatar Nov 06 '19 13:11 dani-e-l

No, I do not plan to work on this. Avro 1.9.x seems fundamentally problematic wrt its dependencies.

cowtowncoder avatar Nov 06 '19 17:11 cowtowncoder

@DanielCharczynski We fixed this vulnerability by using tech.allegro.schema.json2avro converter version 0.2.9

ZachManno avatar Nov 08 '19 16:11 ZachManno

Could you check if this is still the case with Apache Avro 1.9.2? Would love to help here.

Fokko avatar Feb 19 '20 08:02 Fokko

Hello! A couple of quick points:

  1. Name validation can be turned off in Schema.Parser. It's probably not great to use non-standard names across platforms/languages, but it's possible.
  2. The schema can store other metadata. The java-class attribute is probably a good bet to store the Java class name!
  3. There shouldn't be any problem with forward/backward compatibility strictly "over-the-wire" (I sincerely hope!) Binary serialization doesn't include name information at all. However, it's probably possible for the 1.8.x Java implementation to create an Avro file with a "bad" schema, or store it in a registry, etc.

+1 to helping get this working for jackson!

RyanSkraba avatar Feb 19 '20 09:02 RyanSkraba

@Fokko I can confirm this is still an issue with Avro 1.9.2 and Jackson versions 2.9.8, 2.9.10, and 2.10.2

Sage-Pierce avatar Feb 19 '20 19:02 Sage-Pierce

@Sage-Pierce thank you for verifying this.

cowtowncoder avatar Feb 20 '20 21:02 cowtowncoder

  1. The schema can store other metadata. The java-class attribute is probably a good bet to store the Java class name!

It would be great! ... Except the reference implementation doesn't do this for record, enum, or fixed types. There's a lot of tooling and infra that depends heavily upon the reference implementation of Avro, so breaking compatibility with it is a dangerous game. IMHO the better solution would have been for the reference implementation to switch to preferring java-class if it exists, and always including java-class (even for record/enum/fixed types) whenever the java type isn't $namespace.$className. Honestly that'd probably be a good direction to take Jackson-Avro as it gives us more options on how to handle data correctly in a mixed-version ecosystem.

  1. There shouldn't be any problem with forward/backward compatibility strictly "over-the-wire" (I sincerely hope!) Binary serialization doesn't include name information at all. However, it's probably possible for the 1.8.x Java implementation to create an Avro file with a "bad" schema, or store it in a registry, etc.

The binary side of things are indeed fully compatible, but the binary can't be read without a schema, so I consider the schema part of compatibility. If a 1.8 client can't read 1.9 data written with a 1.9 schema and map it correctly, backwards compatibility was broken. Even though forward compatibility was not broken, there are some relatively simple/common usage patterns that make upgrading a herculean effort. (Essentially, all consumers have to be upgraded first, and only then you can upgrade the producers. But what do you do for an application that is both a consumer and producer? load different versions of the library on different classloaders?)

baharclerode avatar Feb 23 '20 09:02 baharclerode

Hmmh. I must say that I find possible incompatibility between 1.8 and 1.9 tooling deeply troubling, especially as binary format should itself be compatible. And even from basic version standpoint it is... not great.

While I do not necessarily understand full complexity here, it seems to me that it may be necessary to sort of work Jackson Avro format module into 2 implementations; existing older one for 1.8 and prior, and then something else for 1.9 and beyond? Naming of this thing is probably ... interesting problem, but both could co-exist and aside from question of what nominal "format name" (property that Jackson format implementations declare they support), might work. Obviously maintaining 2 separate backends is not ideal at all, but at least it would give users the possibility of selecting one that works better -- or, possibly, with enough work, ability to dynamically select one or the other?

I assume new module would be something like jackson-dataformat-avro19 / Avro19Module?

cowtowncoder avatar Feb 24 '20 00:02 cowtowncoder

Ultimately, I don’t think we need to maintain two different backends. Jackson is flexible enough that we can upgrade to avro 1.9, fix the deserialization/serialization code to handle 1.8 or 1.9 schemas automatically, and add a mapper feature to select if you want 1.8 or 1.9 schemas to be generated (defaulting to 1.8 for compatibility). Switching an existing ecosystem from 1.8 to 1.9 will still be painful, but Jackson won’t be the bottleneck/blocker.

I’m mostly just frustrated at the reference implementation because unlike Jackson, I can’t have full 1.8 and 1.9 support side-by-side in the same app without shading or classloader magic.

baharclerode avatar Feb 24 '20 08:02 baharclerode

To give some background, Avro 1.8.x was still on the old codehaus Jackson. Avro 1.8.2 was released back in June 2017, quite a while ago. Since then we've updated a lot of outdated dependencies, including moving from Jackson 1.x to 2.x and deprecating Joda-time and such. Since back then there we're some Jackson objects part of the public API, we've decided to remove these since we had to break the API anyway (sad, I know).

We've analyzed the API's that we've broke, this is mostly Jackson and Joda stuff: http://people.apache.org/~busbey/avro/1.9.0-RC4/1.8.2_to_1.9.0RC4_compat_report.html The Joda can still be enabled by setting an option, but we didn't want to ship Avro with the old version of Jackson anymore.

Fokko avatar Feb 24 '20 09:02 Fokko

@Fokko removal of Jackson 1.x (and especially removal of the exposure via Avro API) is great, and no complaints there. Replacements are (as far as I understand) fine and not problematic in themselves. But it seems that some changes outside this particular dependency are bigger issues

@baharclerode if that can be done, great. I wonder if this would fit in timeframe for 2.11 (quite imminent), or 2.12? Or, put another way: are there things to be done in 2.11, with limited compatibility problems, but that would lead towards breaking changes in 2.12? I would be happy to help coordinate preparation changes (there are PRs, I know).

cowtowncoder avatar Feb 26 '20 18:02 cowtowncoder

🍿 😮 🍿

Are there Avro JIRAs to file/watch for complaining about the "minor" semantic number introducing backwards incompatible changes?

OneCricketeer avatar Feb 27 '20 09:02 OneCricketeer

For info: @cricket007 https://issues.apache.org/jira/browse/AVRO-2687 and a link to the last discussion on the mailing list. Avro has not followed literal semantic versioning, which is often unexpected. Your (tactful and gentle) perspective would be very welcome :smile:

@baharclerode : I do agree with you about the schema being an important part for backwards/forwards compatibility. The current behaviour was definitely a fix for cross-platform compatibility in 1.9.x... it's unfortunate nobody foresaw code relying on the invalid names from 1.8.x!

I think I had a basic misunderstanding on what Jackson-Avro needed to work -- I'm taking a deeper look.

Avro is planning a 1.10.x release in May (which should not be considered a minor semantic version bump). It might be worthwhile working together and targetting that to have the least-painful / most-correct fixes in both projects -- maybe we can restore the "better solution" :point_up: behaviour for ReflectData at that point?

RyanSkraba avatar Feb 27 '20 10:02 RyanSkraba

@Fokko removal of Jackson 1.x (and especially removal of the exposure via Avro API) is great, and no complaints there. Replacements are (as far as I understand) fine and not problematic in themselves. But it seems that some changes outside this particular dependency are bigger issues

Which (other) changes do you refer to concretely @cowtowncoder ?

iemejia avatar Feb 27 '20 13:02 iemejia

@iemejia I wish I remembered the exact details, from my attempts to upgrade, but I think there were actual behavior changes between apache avro 1.8 and 1.9, so that although I was able to make Jackson Avro module compile with changes to use new methods in 1.9, existing test suite failures, possibly related to changes in schema name generation changes. I think @baharclerode is a better source for concrete details at this point. I did create branch exp/avro-1.9-for-2.10 for my upgrade attempt, so I can reproduce the test fails.

As to Avro 1.10.x, it sounds like this might be good opportunity to resolve the upgrade challenges. I am trying to finalize Jackson 2.11 soon; maybe we can do some of the fixes for that -- like at least remove use of avro lib's internal JacksonUtils. But even beyond that I hope to reduce time between 2.x minor versions so that anything fixed for 2.12 would not take as long as 2.10 did.

@baharclerode I'll have a look at that PR now.

cowtowncoder avatar Feb 29 '20 00:02 cowtowncoder

First patch merged, so now module does compile against avro lib 1.8.2 and 1.9.2, and in latter case does not pull in Jackson 1.x any more. There would be those interoperability test failures, but one problem at a time.

cowtowncoder avatar Feb 29 '20 03:02 cowtowncoder

Great to see this progressing. If we have further issues please don't hesitate to contact us, create JIRAs etc. As you definitely should know it is sometimes hard to track every use of our API but we are eager to not break our users or provide alternatives if so.

For future breakage awareness we are going to remove Joda Time in Avro 1.10 so please use the Java Time APIs if possible to ease future migration.

iemejia avatar Mar 03 '20 15:03 iemejia

maybe we can restore the "better solution" ☝️ behaviour for ReflectData at that point?

@RyanSkraba It's all a bit moot until my organization has a migration path off of 1.8, which will involve running a mixed ecosystem of 1.8.x and 1.9.x both as producers and consumers concurrently. Until we can figure out that, I'm effectively stuck on 1.8. We've got live distributed systems spanning multiple teams that we can't take down for atomic upgrades.

As you definitely should know it is sometimes hard to track every use of our API but we are eager to not break our users or provide alternatives if so.

@iemejia Yeah, I get that. And thanks for the callout on Joda; I don't think Jackson-Avro has any hard references to that, but I've not looked to see if the API was transitively exposed anywhere by virtue of us relying on type hints embedded in the schemas.

Which (other) changes do you refer to concretely @cowtowncoder ?

I believe @cowtowncoder is referring to the schema name -> class name resolution; These don't involve the Jackson 1.x dependency, but break 1.8.x implementations from using schemas generated by the 1.9 reference implementation when using reflection to map schema types to concrete types in scenarios involving nested classes.

Essentially, I would propose the following in the Avro reference implementation:

  1. We need a patch release for 1.8.x that allows it to handle 1.9.x schemas reflected from nested POJOs that were previously working with 1.8.x; This will probably involve patching SpecificData.getClass(Schema) to properly find nested classes when the schema both has $ and doesn't have $ in the namespace; as well as patching GenericData.resolveUnion(Schema,Object) to correctly resolve the union branch for instances of nested classes regardless of whether the schema has a trailing $ in the namespace or not.

  2. We need a patch release for 1.9.x that allows it to handle 1.8.x schemas reflected from nested POJOs; This will probably involve patching GenericData.resolveUnion(Schema,Object) to correctly resolve the union branch for instances of nested classes regardless of whether the schema has a trailing $ in the namespace or not.

  3. We need a patch release for 1.9.x that makes the use of java-class pervasive. Update ReflectData.createSchema(Type,Map) to always generate this property for records, enums, and fixed schemas whenever Class.forName(mangledNamespace + "." + mangledSimpleName) doesn't map to the concrete class; Update SpecificData.getClass(Schema) to utilize java-class even for maps, records, enums, and fixed schemas if it exists.

This will accomplish two things:

  1. 1 and 2 provide an easy upgrade path from 1.8.x to 1.9.x by more fully supporting a mixed ecosystem.
  2. 3 prepares to fix the remaining namespace allowed-characters bug in 1.10.x

In 1.10.x, we can blindly mangle the namespaces and record names however we want, since older 1.9.x clients will get the correct class name from the java-class property.

(We will need to make corresponding tweaks in Jackson-Avro, which is the second of the two PRs I described in my earlier comments in this thread, but have not written yet).

baharclerode avatar Mar 09 '20 06:03 baharclerode

@baharclerode If I understand above correctly, it sounds like...

  1. Jackson avro 2.11 does compile, run against apache avro 1.8 and 1.9, with fixes, but behavior varies wrt schema handling, failing some tests against 1.9. This allows some users to possibly upgrade to 1.9 if they do not need backwards compatibility; but this requires explicit override of version (and we do not want to increase default dep version yet due to compatibility problems)
  2. Further improvements to Jackson Avro format module now depend on changes to apache avro library, as this module uses it for most of schema handling.

If accurate, I think I can proceed with Jackson 2.11.0 at this point -- it is not delayed by Avro module, there have been other issues, but there is some urgency to getting some of improvements, fixes out -- and baseline increase wrt apache avro lib could go in 2.12.0 if issues can be resolved.

cowtowncoder avatar Mar 09 '20 16:03 cowtowncoder

I've run across the same vuln warnings that 1.8.2 has and noticed that avro 1.10.0 is released. Is it worthwhile to test this new minor versions to see if the issues are resolved?

conniey avatar Jul 06 '20 21:07 conniey

FWIW, @baharclerode the issue you brought up about nested class name deserialization breaking with Avro [1.9.0, 1.10.0) may have been fixed in ~1.10.0~ (actually, looks like it may not have gotten in to 1.10.0) by this PR

Sage-Pierce avatar Jul 06 '20 22:07 Sage-Pierce