jackson-dataformats-binary icon indicating copy to clipboard operation
jackson-dataformats-binary copied to clipboard

Avro 1.12.0 breaks jackson-dataformat-avro (also: requires JDK 11)

Open rdifrango opened this issue 1 year ago • 32 comments

We were attempting to upgrade our project to Avro 1.12.0 with jackson-dataformat-avro version 2.17.2

The failure I get is as follows:

Exception in thread "main" java.lang.NoSuchMethodError: 'org.apache.avro.Schema$Parser org.apache.avro.Schema$Parser.setValidate(boolean)'
	at com.fasterxml.jackson.dataformat.avro.AvroMapper.schemaFrom(AvroMapper.java:240)
	at com.fmr.difrango.ztester.DemoJacksonAvroFailure.main(DemoJacksonAvroFailure.java:23)

To demonstrate this, I created the following example program that uses issue19.avsc as the schema to load.

package com.difrango.ztester;

import com.fasterxml.jackson.dataformat.avro.AvroMapper;
import com.fasterxml.jackson.datatype.jdk8.Jdk8Module;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.IOException;

public class DemoJacksonAvroFailure {
    private static final Logger LOGGER = LoggerFactory.getLogger(DemoJacksonAvroFailure.class);

    public static void main(String[] args) {
        var expectedStream =
                DemoJacksonAvroFailure.class.getClassLoader().getResourceAsStream("issue19.avsc");

        var mapper = new AvroMapper();
        mapper.registerModule(new Jdk8Module());
        mapper.registerModule(new JavaTimeModule());

        try {
            var expected = mapper.schemaFrom(expectedStream);
        } catch (IOException e) {
            LOGGER.error("Error Handling the Schema", e);
        }
    }
}

rdifrango avatar Aug 30 '24 12:08 rdifrango

Yes, see #167 -- you cannot upgrade Jackson 2.17 and below to Avro library past 1.8.2.

But upcoming 2.18.0 should allow upgrade.

cowtowncoder avatar Aug 30 '24 16:08 cowtowncoder

Thanks @cowtowncoder. It's interesting that we've been successfully using AVRO 1.11.2 so it must be some edge cases that we aren't currently hitting.

rdifrango avatar Aug 30 '24 16:08 rdifrango

@rdifrango Ah. I forgot that the issue wrt upgrade were mostly in unit tests, not on binary compatibility between module and codec library.

But I hope 2.18.0-rc1 -- being released today! -- will work with 1.12.x and above.

cowtowncoder avatar Aug 30 '24 20:08 cowtowncoder

2.18.0-rc1 now released: @rdifrango not sure if it's easy to do, but it'd be great to see if upgrade would work with 2.18, for eventual 2.18.0 release

cowtowncoder avatar Sep 02 '24 00:09 cowtowncoder

2.18.0-rc1 now released: @rdifrango not sure if it's easy to do, but it'd be great to see if upgrade would work with 2.18, for eventual 2.18.0 release

I just tested it and I am seeing the same issue.

I just looked at AvroMapper and noticed that here it still has new Schema.Parser().setValidate(true) and it's the .setValidate(true) that was removed in 1.12.0

rdifrango avatar Sep 03 '24 15:09 rdifrango

@rdifrango rats. Ok. That may also explain why upgrade was to 1.11.3 not 1.12.0.

cowtowncoder avatar Sep 03 '24 16:09 cowtowncoder

@rdifrango rats. Ok. That may also explain why upgrade was to 1.11.3 not 1.12.0.

Quite possibly...and I'll grant that the move to 1.12.0 is not backwards compatible.

rdifrango avatar Sep 03 '24 17:09 rdifrango

It seems odd for Avro library to nominally follow SemVer but... seemingly not? (although as a library maintainer I understand how difficult backwards-compatibility is).

But would be good to know why the method was removed & if it had been at least deprecated earlier. Plus whether there might be replacement to use, from older versions too.

cowtowncoder avatar Sep 03 '24 17:09 cowtowncoder

It seems odd for Avro library to nominally follow SemVer but... seemingly not? (although as a library maintainer I understand how difficult backwards-compatibility is).

But would be good to know why the method was removed & if it had been at least deprecated earlier. Plus whether there might be replacement to use, from older versions too.

I'd agree @cowtowncoder especially since I'm having a conversation with on this PR with the commons-validator team and they are being very strict.

I did look over on the avro side, but couldn't find any specific mention of why this was changed and it was unclear to me if it's now simply the default behavior of the parser or not.

Should we ask over there as well?

rdifrango avatar Sep 04 '24 18:09 rdifrango

@rdifrango I guess damage is done, API compatibility broken b/w versions. So not sure how much value there would be. (not sure how commons-validator is related -- is that a dependency they added/removed?)

cowtowncoder avatar Sep 04 '24 18:09 cowtowncoder

@rdifrango I guess damage is done, API compatibility broken b/w versions. So not sure how much value there would be. (not sure how commons-validator is related -- is that a dependency they added/removed?)

agreed the damage is done, the commons-validator was more an indicator that other apache projects seem to hold to strict guidelines around binary and API compatibility which clearly Avro has not.

rdifrango avatar Sep 04 '24 18:09 rdifrango

One other reason why we probably cannot upgrade: looks like Avro 1.12.0 is built with JDK 11, and Jackson 2.18 still has Java 8 as the baseline.

cowtowncoder avatar Sep 05 '24 03:09 cowtowncoder

One other reason why we probably cannot upgrade: looks like Avro 1.12.0 is built with JDK 11, and Jackson 2.18 still has Java 8 as the baseline.

That definitely makes this tricky, any thoughts on a path forward?

My concern is that if some sort of CVE is discovered in the AVRO project there currently isn't a path forward.

rdifrango avatar Sep 05 '24 15:09 rdifrango

@rdifrango No idea short-term. Things are tricky as it's multi-module Maven project so would need to at least build all with JDK 11; but then try to target JDK 8 (if even possible) for non-Avro backends.

But on very short term I think baseline code needs to remain 1.11.x.

Ideally we would let users override with newer version tho.

And I think it'd actually be possible to use matrix build for CI testing to try version override, if we get module itself run against newer library version -- but that's not solved yet; how to avoid/change parser.setValidate() call.

cowtowncoder avatar Sep 05 '24 17:09 cowtowncoder

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

cowtowncoder avatar Dec 27 '24 01:12 cowtowncoder

Ok, so Avro lib backwards-compatibility is really... not. There isn't any. 1.11.3 -> 1.11.4 breaks tests, as per #539. Patches can be as breaking as minor version or so.

cowtowncoder avatar Dec 27 '24 01:12 cowtowncoder

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

I'll try to take this on but it won't be until mid-January.

rdifrango avatar Dec 29 '24 16:12 rdifrango

I think upgrading master/3.0 first to 1.12.x makes most sense.

Patch 1.11.4 is something users can opt-in regardless of what Jackson 2.x module defaults to (that is, for versions that default to 1.11.3, 2.18.x and 2.19)

cowtowncoder avatar Dec 31 '24 21:12 cowtowncoder

One quick note: Jackson 3.0 (from master branch) will have JDK 17 (at least) as baseline, so dependency could be increased there without problems. While that won't be released until maybe 2025/march (first Release Candidate), it'd be useful upgrade, if anyone could create a PR.

I'll try to take this on but it won't be until mid-January.

ok, I started looking at this and it seems to break a few of the tests. The two groups in particular Jackson to Apache with Jackson schema and Apache to Apache with Jackson schema.

In the one test ListWithPrimitiveWrapperTest.testListWithShorts it's complaining about java.lang.ClassCastException: class java.lang.Short cannot be cast to class java.lang.Integer (java.lang.Short and java.lang.Integer are in module java.base of loader 'bootstrap')

I'm just trying to understand these tests before moving forward.

rdifrango avatar Jan 08 '25 20:01 rdifrango

and related, in ListWithPrimitiveWrapperTest.testListWithDoubles I'm getting the following in those same 2 tests:

org.junit.ComparisonFailure: Expected :[1.0, 0.0, -1.0, 4.9E-324, 1.7976931348623157E308] Actual :[1.0, 0.0, -1.0, 0.0, Infinity]

rdifrango avatar Jan 09 '25 17:01 rdifrango

@rdifrango Latter def sounds like Float used for Double (or vice versa), triggering over-/underflow.

cowtowncoder avatar Jan 09 '25 17:01 cowtowncoder

@rdifrango Latter def sounds like Float used for Double (or vice versa), triggering over-/underflow.

I'm seeing on the reading side that it's hitting the Double section of org.apache.avro.generic.GenericDatumReader, aka this:

case DOUBLE:
      return in.readDouble();

so I'm now looking at ObjectWriter to see how the bytes look and compare it to the other ones that do pass when using doubles.

rdifrango avatar Jan 09 '25 18:01 rdifrango

To your point, with Apache to Apache with Jackson schema or Jackson to Apache with Jackson schema the following do work fine:

  • Float
  • Integers
  • Strings
  • Longs

The one that fails with different values is:

  • Double

The ones not working at all are:

  • Shorts java.lang.ClassCastException: class java.lang.Short cannot be cast to class java.lang.Integer (java.lang.Short and java.lang.Integer are in module java.base of loader 'bootstrap')

  • Characters java.lang.ClassCastException: class java.lang.Character cannot be cast to class java.lang.Integer (java.lang.Character and java.lang.Integer are in module java.base of loader 'bootstrap')

  • Bytes - java.lang.ClassCastException: class java.lang.Byte cannot be cast to class java.lang.Integer (java.lang.Byte and java.lang.Integer are in module java.base of loader 'bootstrap')

rdifrango avatar Jan 09 '25 19:01 rdifrango

Sounds like some level of casting would be needed, somewhere. But not quite sure how/why breakage; I guess Avro Codec started using more accurate Java types, maybe, for its binding (or more generic?).

I know above is not super helpful, just thinking out aloud I guess.

cowtowncoder avatar Jan 09 '25 19:01 cowtowncoder

Sounds like some level of casting would be needed, somewhere. But not quite sure how/why breakage; I guess Avro Codec started using more accurate Java types, maybe, for its binding (or more generic?).

I know above is not super helpful, just thinking out aloud I guess.

Well, for the double case, I at least tracked it down to here and it's close to what you thought, they're converting the Double to a floatValue and when it's called it returns either 0 or Infinity which feels like a bug in Avro to me:

https://github.com/apache/avro/blob/main/lang/java/avro/src/main/java/org/apache/avro/generic/PrimitivesArrays.java#L549-L554

rdifrango avatar Jan 09 '25 19:01 rdifrango

@rdifrango Thank you for tracking that down, yes, symptoms fit the problem. float range not big enough to represent MIN/MAX double, obviously, so gets "rounded"/truncated to 0 / Inf.

Could you file a bug (and maybe even PR) against Avro-lib, that definitely looks like a straight-forward bug, possibly copy-paste error?

This would solve just one problem case, I suppose, but ... one step at a time.

cowtowncoder avatar Jan 09 '25 23:01 cowtowncoder

https://issues.apache.org/jira/browse/AVRO-4110

rdifrango avatar Jan 10 '25 15:01 rdifrango

Thank you @rdifrango !

cowtowncoder avatar Jan 10 '25 21:01 cowtowncoder

Thank you @rdifrango !

No worries and I'm working on the PR to fix that.

I also think the following change is what caused this and is the one causing the failures with Short, Character, and Byte test as well:

https://github.com/apache/avro/pull/2389, specifically this comment.

rdifrango avatar Jan 10 '25 22:01 rdifrango

@cowtowncoder, the first PR is in:

https://github.com/apache/avro/pull/3292

Now to prove my theory on the PR that may have caused the other failures.

rdifrango avatar Jan 16 '25 14:01 rdifrango