jackson-databind icon indicating copy to clipboard operation
jackson-databind copied to clipboard

v2.13.2.1 fix breaks subclassing of UntypedObjectDeserializer.Vanilla.deserialize

Open ben-spiller opened this issue 2 years ago • 7 comments

Describe the bug Upgrading from 2.13.2 to 2.13.2.1 has broken our application, despite there being no warning of any compatibility-breaking changes in the release notes (https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.13.1).

We have a custom UntypedObjectDeserializer.Vanilla subclass that overrides the public API method deserialize(JsonParser jp, DeserializationContext ctxt). Unfortunately the fix for GH-2816 has created a new overload for this method with an additional "depth" argument, meaning the original (documented API signature) is no longer called in several cases which broke our custom logic resulting in incorrect output (and given the new overload is private not protected, there's no way to work around this).

Arguably subclassing UntypedObjectDeserializer.Vanilla for use with module.addDeserializer may not be a good idea - indeed we've worked around the jackson change by subclassing UntypedObjectDeserializer instead of .Vanilla (pretty easy, once we tracked down the cause of the problem). However UntypedObjectDeserializer.Vanilla is public API, not marked as "final" and has no note on the javadoc to tell people not to use it in this way so it's an easy trap to fall into with the current docs, and there was no warning about this in the fix notes - worth fixing.

Version information 2.13.2.1

To Reproduce Unfortunately I haven't had time to put together a standalone repro case, but thought you'd like to be aware of the issue nonetheless.

Expected behavior Please update https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.13.1 asap to say that this security update may be a breaking change for anyone who has subclassed UntypedObjectDeserializer.Vanilla and overridden deserialize(). Users do not expect any risk of breakage from a x.x.x fix so calling this out (although unlikely) is worthwhile, and hopefully minimal effort to add this note.

You might also consider making UntypedObjectDeserializer.Vanilla a "private" or "final" class if you don't want it to be subclassed and used in this way (or at least adding a doc note), which would avoid any future risks when this class is edited.

ben-spiller avatar Apr 22 '22 14:04 ben-spiller

@ben-spiller Thank you for reporting this issue: it is indeed unfortunate. I wish I had paid closer attention to the contribution to consider backwards-compatibility aspect in more detail. While there has never been any intent to promote sub-classing of standard (de)serializers, it is not entirely unreasonable to do so so (with the usual caveats). I'll have to consider if Vanilla variant should be made final and perhaps private as well. As to no notes on breakage, this is simply due to us not realizing there is a problem. And to be completely honest, it is very difficult to predict actual breakages compared to theoretically possible ones. Still, had this not been fix for a CVE deemed critical (which I disagree with but that's another story), change would never have been made in a patch; it would have gone in the next minor release and possibly uncovered during alpha/beta release stages.

Aside from adding warning(s) for this, would you be willing to help see if we could make things compatible for 2.13.3? (2.14.0 will not be using same implementation so it should not have this particular issue; no depth argument passed... then again, no implementation yet exists)? That, and ideally also a simplified unit test based on sub-classing -- to guard against regression -- would be great to have. I could add these methods myself but I have found that the use of existing code (that in this case suffered from incompatibility) for tests usually works best.

EDIT: actually, re-reading the description, I am not sure if it is possible to change things to work with overloads. I guess it really depends on details of how sub-classing is done. I would still be interested in figuring out a way to possibly alleviate the issue with 2.13.3.

cowtowncoder avatar Apr 24 '22 01:04 cowtowncoder

I added a note on https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.13 next to 2.13.2.1 micro-patch.

Note that

https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.13.1

is not a good place (I think?) to add a note since breakage is in 2.13.2.1 while follows 2.13.2.

cowtowncoder avatar Apr 24 '22 01:04 cowtowncoder

Thanks for the quick update!

I think that subclassing UntypedObjectDeserializer is still ok, just subclassing UntypedObjectDeserializer.Vanilla is the problem?

I can see two ways to make this a safe way:

  1. Pass the extra argument via a back-channel (e.g. setting a field before calling the main deserialize method) rather than an overload, since the original overload needs to be called in all situations to keep compatibility. That's certainly a bit gross, but probably the best we can do given existing API.
  2. Revert revert all changes to the existing Vanilla class (+ maybe deprecate it?), and create a copy of the Vanilla class (perhaps with private or package visibility since not sure it's meant to be public API?) which you can apply the security fix to, then change all places that were calling/using the public Vanilla class to use the new one (effectively changing the original Vanilla into dead code that's only used for this subclassing use case)

Hope that's useful. Much as I'd love to help I'm unfortunately not in a position to be able to offer much time to contribute testing etc right now (got some big things both in and outside work taking up all my cycles at the moment :) )

ben-spiller avatar Apr 26 '22 09:04 ben-spiller

@ben-spiller Ok thank you. I appreciate your response and understand that there's often limited time to contribute to OSS projects (my situation is not very different in that sense :) ).

I guess I was hoping there might be easier thing to change (some simple oversight), but looking at code and thinking it through it sounds like probably there is on easy way to make changes to allow sub-classing to work, without gnarly work arounds (like trying to pass depth off-band or such). If it was possible to just, say, share the code you have, that could help -- but there are probably problems in sharing corporate code openly so I understand sharing would probably require non-trivial cleansing etc. etc. which is not practical at this point.

Given this I think the best course of action for 2.14 is to add notes to Javadoc warning against sub-classing of Vanilla variant (and perhaps disclaimer on UntypedObjectDeserializer itself -- it is not designed as a base class for custom deserializers and there no real guarantees it won't break -- but at least it is less likely to break than Vanilla).

One thing that you could consider -- if you have time in near future -- could be to contribute a test for ensuring sub-classing of UOD with maybe simplified logic. If there is a regression test I can much more easily reduce change of things breaking. To put it bluntly, things that are tested are unlikely to break and things not covered by tests may well break due to various refactorings. Despite not having any intent to break existing usage. But sub-classing tends to be quite brittle because there's almost infinite variety of ways in which one can change behavior that way. :)

cowtowncoder avatar Apr 26 '22 23:04 cowtowncoder

OK fair enough.

perhaps disclaimer on UntypedObjectDeserializer itself -- it is not designed as a base class for custom deserializers and there no real guarantees it won't break -- but at least it is less likely to break than Vanilla

This surprised me. So which class are people meant to subclass if they want to use the default deserialization logic but with some customizations?

ben-spiller avatar Apr 27 '22 10:04 ben-spiller

@ben-spiller Generally classes with prefix of Std are designed as base classes (Std[De]Serializer), as well as various base (de)serializers for Collections and Maps: those are sub-classed by datatype modules. There is no well-defined division, although I think I started moving deserializer implementations into packages so that:

  • src/main/java/com/fasterxml/jackson/databind/deser/std -- public, sub-classable
  • src/main/java/com/fasterxml/jackson/databind/deser/impl -- implementation details

Still, there is no clear guideline or design; intent varies by specific class.

Now that I think about this, too: for me sub-classing, if used, would be more for creating handlers for new/different types and not about changing existing behavior. For changing behavior I'd prefer explicit configurability -- but then again, defining that is more work, perhaps, than defining methods that may be sub-classed.

"Untyped" object deserializer/serializer are kind of unique even beyond that; much more complex than typical (de)serializers.

cowtowncoder avatar Apr 27 '22 14:04 cowtowncoder

Ah. Having set all of above, realized that UntypedObjectDeserializer is under .../deser/std. So by my own logic it should be "more public" in a way. I think it makes sense, then, for main class to try to retain its API (when possible), and try to close up Vanilla variant.

cowtowncoder avatar Apr 29 '22 00:04 cowtowncoder