jackson-databind icon indicating copy to clipboard operation
jackson-databind copied to clipboard

Polymorphic subtype deduction based on 'presence' of fields does not appear to take 'absence' of fields into account

Open lderienzo opened this issue 3 years ago • 19 comments

Describe the bug Our use case involves the following: We are calling a web service that returns a JSON response that is a composite object. The object has a member field which is itself an object whos field members change depending on the service called, therefore we implemented this object as an interface with a number of different implementations corresponding to each different service, so it is a "polymorphic" member of the response object. Therefore, we placed the @JsonTypeInfo(use = DEDUCTION) annotation a top this interface definition.

In one particular service call, this response object returns with an empty object for this polymorphic member. When this case occurs, an error similar to the following is encountered(say we have only two implementations defined in the @JsonSubTypes annotation) :

com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Cannot deduce unique subtype of `com.engageft.jackson.deserialization.issue.objects.AnInterface` (2 candidates match)
 at [Source: (String)"{
   "compositeClassIntMember":11235,
   "compositeClassStringMember":"compositeClassString",
   "polyMorphicMember":{
      
   }
}"; line: 6, column: 4] (through reference chain: com.engageft.jackson.deserialization.issue.objects.CompositeClass["polyMorphicMember"])
	at com.fasterxml.jackson.databind.jsontype.impl.AsDeductionTypeDeserializer.deserializeTypedFromObject(AsDeductionTypeDeserializer.java:131)

We then thought we'd try creating an implementation of this polymorphic member without any fields specifically for this case of an empty response object in the hopes that this mechanism would find this "empty implementation" and use it, but this did not work.

After examining the code found here, it appears that the case, or situation of if/when a particular implementation contains no data members is not being accounted for.

Other than this, we find this functionality absolutely fantastic and immensely helpful! Thank you for adding it! We hope that you might find a way to account for and solve this issue. I realize this is probably an outlier of a use case, and so is probably one that is very easily overlooked.

If you think you may benefit from any further assistance from me on this matter please do not hesitate to let me know :) .

Version information Which Jackson version(s) was this for? jackson-databind-2.12.0

To Reproduce I've created a small project which demonstrates this issue and it's located here.

Expected behavior The tests in the test project should outline the whole thing, but if they don't please let me know.

Additional context Can't think of any at the moment, but if I do, I'll amend this.

lderienzo avatar Dec 10 '20 03:12 lderienzo

True, implementation only considers properties found. It might be possible to figure out a fallback case, if (but only if) such could be uniquely determined. What could help here would be a set of example definitions along with JSON to outline expected case.

Another possibility is that existing @JsonTypeInfo(... , defaultImpl = DefaultType.class) could maybe be used (I would think that'd be fallback anyway).

cowtowncoder avatar Dec 10 '20 05:12 cowtowncoder

Thanks @cowtowncoder for your prompt reply.

I'm not sure if you caught it or not, but I did create a small sample project with a unit test demonstrating the problem. It's mentioned under the "To Reproduce" section.

Also, I actually forgot to mention that I did try the @JsonTypeInfo(... , defaultImpl = DefaultType.class) option but it didn't work, so now I'm wondering if perhaps I didn't do this correctly somehow. I will go back over it and try again. I'll add this to my sample project.

Thanks again much. Cheers, -Luke

lderienzo avatar Dec 10 '20 17:12 lderienzo

I added @JsonTypeInfo(... , defaultImpl = DefaultType.class) to the sample project but couldn't get it to work as you suggested trying. I've pushed those changes to my sample project repo for you to look at.

I'm now actually wondering if there is a problem with defaultImpl when used with DEDUCTION. I found what might be a similar issue here.

lderienzo avatar Dec 10 '20 18:12 lderienzo

Here is a JUnit 5 unit test for this feature based on a Stack Overflow question:

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.json.JsonMapper;
import org.junit.jupiter.api.Test;

import java.util.List;

import static org.junit.jupiter.api.Assertions.assertEquals;

public class DeductionBasedPolymorphismDefaultTest {
    @Test
    public static void main(String[] args) throws JsonProcessingException {
        String json = "{\"animals\": [{\"name\": \"Sparky\"}, {\"name\": \"Polly\", \"wingspan\": 5.25}]}";
        ZooPen zooPen = new JsonMapper().readValue(json, ZooPen.class); // Currently throws InvalidTypeIdException

        assertEquals(Animal.class, zooPen.animals.get(0).getClass());
        assertEquals(Bird.class, zooPen.animals.get(1).getClass());
    }

    public static class ZooPen {
        public List<Animal> animals;
    }

    @JsonTypeInfo(use = JsonTypeInfo.Id.DEDUCTION, defaultImpl = Animal.class)
    @JsonSubTypes({@JsonSubTypes.Type(value = Bird.class)})
    public static class Animal {
        public String name;
    }

    public static class Bird extends Animal {
        public double wingspan;
    }
}

mjustin avatar Feb 12 '21 07:02 mjustin

@mjustin @cowtowncoder , thank you guys for taking the time to look into this and provide that sample Junit code demonstrating how to use this feature.

On a slightly different note, but still related to this same functionality, I think I might have stumbled across something I was wondering if you could also take a look at when you had a chance.

I started encountering the following exception: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Cannot deduce unique subtype of com.engageft.services.carta.integrations.domain.response.serviceresponse.ServiceResponseData (2 candidates match)

So I decided to look into the code with the debugger, and here's what I discovered:

In the AsDeductionTypeDeserializer class it appears that after narrowing down candidates in the prune method based on the presence of certain properties, once this method returns for the last time to the deserializeTypedFromObject method from where it's being called, it appears that the logic it's using to determine whether it has the proper deserialization candidate or not might be off.

So it goes through looking for whatever properties are present in the json, BUT it looks for those properties only. So if it finds those same properties present in more than one candidate, it lists both candidates. However, it appears to fail to look to see if there are any remaining properties present in any of the candidates that would/could once again distinguish them. That is... It's not considering the Bit "set" of ALL properties of each candidate when it does its search and comparison for a candidate. It's not taking the collection/combination of properties to be unique and then eliminating/selecting candidates based on the collection of properties.

please see attached screen shot for further clarification: jackson-polymorhic-deduction-scrnshot

It seems to me that the code would need to first collect all the properties present in the JSON, form a BitSet out of these, and then compare the BitSet obtained from the JSON with the BitSets of each candidate.

Anyway, these are just some initial observations upon a brief inspection of things. It's possible I may have missed or am not completely understanding something, but in case I'm not, I thought I'd just mention this to you guys for whatever it may be worth.

Thanks again a ton for any and all attention you have already given and may give to any of this! And thanks again for all your hard work creating such a great library!

Cheers!

Very best, -Luke

P.S. If any of this explanation does not make sense please let me know as I would be more than happy to clarify. Also, if you need any additional information at all just say the word and I'll get it to you. Let's just say, if you need anything at all that would help you to look into this, understand this, troubleshoot this, recreate this etc., let me know. Thanks!

lderienzo avatar Feb 13 '21 03:02 lderienzo

I must admit that since this feature came as a contribution, I am not as intimately familiar with it as many other parts of the codebase, but it makes sense. I'll cc @drekbour (author of this feature) who is better qualified to comment.

cowtowncoder avatar Feb 13 '21 04:02 cowtowncoder

Two things are reported above: 1 Lack of support for deduction fallback to defaultImpl. This feels like a simple bug so I've raised #3305 and will PR soon. 2 Discussion on deduction via absence of fields. This is "a feature not a bug". It was discussed during the implementation and skipped for reasons of complexity both to implement and to support. Here is a statement of today's support:

The current (2.10) deduction process does not support pojo-hierarchies such that the absence of child properties infers a parent type. That is, every deducible subtype MUST have some unique properties and the input data MUST contain said unique properties to provide a positive match.

Without this, the decision process would be much more opaque. Imagine the following:

@JsonInclude(Include.NON_NULL) // null fields are omitted
@JsonTypeInfo(use = JsonTypeInfo.Id.DEDUCTION)

If a wingspan property is not present, is it the Animal class which doesn't have such a property, the Bird class that simply has an elided null value or is it perhaps the BigBird class that had a problem during serialisation and omitted this value. There are thousands of such combinations and no way for the deser code to discern the senders intent.

Given support for defaultImpl to provide generic fallback, I don't "like" this feature unless we can state a compelling "no-surprises" strategy that won't confuse.

[aside: Perhaps the statement of support (and why) is not clear enough, I'll include additional javadoc clarification in the #3305 PR]

drekbour avatar Feb 13 '21 12:02 drekbour

@drekbour makes sense -- I was to (but forgot) point out that reliance on intermediate parent types seems bit iffy to me as well, from modelling perspective.

cowtowncoder avatar Feb 13 '21 20:02 cowtowncoder

@cowtowncoder , @drekbour Any thoughts or observations on this other item? The one I'm trying to point out in the attached image?

The item being the way in which the act of deduction appears to be being performed...

...by comparing only the properties present in the JSON being deserialized to those in the prospective candidates, and then ruling a "match" if all the properties are present, but then not checking if only those properties are present.

That is, not checking if a potential candidate has additional properties which would make might make it unique and remove it from candidacy?

That is, basing the designation of "candidacy" on property-by-property comparison of present properties vs. the comparison of "sets" of properties as a whole.

To my understanding, what makes a potential "candidate" is an implementation which contains exactly the properties contained in the JSON object being deserialized.

My apologies in advance if I'm somehow understanding the way this implementation is supposed to work, and/or you already addressed this somehow in a previous reply that I may have somehow missed :(.

Many thanks as always. -Luke

lderienzo avatar Feb 16 '21 20:02 lderienzo

I agree in general that the code is not doing a total match but that's because it can't. Supported properties can be missing from the JSON (without that being a problem) while others may be ignored. Many Jackson annotations/features also affect what goes on. Generic code cannot know the intentions of the user so it does the best it can by eliminating those candidates that appear impossible then passing its best guess to the full deserialisation process.

The key is line 111 in the screenshot (tip: Download Sources to get author's comments!) which means the only JSON properties that are checked are ones that are supported by at least one subtype. That means they can be used to eliminate others if present.

drekbour avatar Feb 16 '21 22:02 drekbour

I think this is implied already, but my understanding is that the deduction here relies on "discriminators", values that are distinct to a type; and that these should be relatively few in number. Idea is not to try to exhaustively list all possible properties (that would be fragile and negative benefits of the approach); and like @drekbour said, Jackson does not really make it even possible to definitely figure out all properties of everything there might be in type hierarchy. And there are also "any properties" to consider (@JsonAnyGetter / @JsonAnySetter), as well as metadata like possible Object Id (Type Id not being used here).

cowtowncoder avatar Feb 17 '21 23:02 cowtowncoder

Leaving this here, in case anyone finds inspiration to further improve handling, but after improvements to defaultImpl handling (see #3055) there are no specific plans for improvement on this.

cowtowncoder avatar Feb 19 '21 01:02 cowtowncoder

Thank you guys for explaining this. I think I'm seeing and understanding what you're saying in the comments addressing the screenshot.

If I'm understanding things correctly, it seems that what this boils down to is the the fact that Javascript/JSON is a "loosely typed language" and Java is a "strictly typed language" so JSON is lacking the "certainty" with regards to type structures that Java requires.

If this code represented a romantic relationship between two people, Java would be the type of person that "needs and demands commitment", while the javascript would be the "free spirit" in the relationship, which means it would probably never work! lol But, since this is NOT a relationship between two people, and is instead a relationship between two computer programming languages, there might actually be some "hope" of it someday working. :)

This would definitely be a challenging coding task to improve! If the inspiration shows up for me at some point in the form of time and energy I may just take a whack at it! :)

Thank you guys again for all the time and attention you have given to this issue.

Cheers, -Luke

lderienzo avatar Feb 25 '21 01:02 lderienzo

Thank you @lderienzo, that is an interesting way to describe it -- and Java (and thereby Jackson) definitely are more along more-static typing continuum.

This does not mean that more dynamic approaches still wouldn't be nice to have: earlier I was thinking that a mechanism that would pre-bind value into JsonNode and pass that to TypeDeserializer, and then have some configurable way to define extraction rule for type id, would give enough power for customized but still possibly convenient handling.

Some challenges come from backwards compatibility. But if sacrificing little bit of type safety, perhaps JsonTypeInfo and/or @JsonTypeResolver could be bent to allow such handler(s).

But I am sure there are many other approaches that could work, if you (or anyone else) has time to spend on figuring something out.

cowtowncoder avatar Feb 25 '21 02:02 cowtowncoder

Hey @cowtowncoder :)

I've decided I'd like to find some time somewhere soon to take a look at the code and see if maybe I can't contribute something of value to the project since it has contributed so much to me over the years!

I've cloned the jackson-databind repo and will be pulling it down soon at some point and giving it all a good look-over. I'm not going to lie, sometimes there is a level of complexity that I do struggle to understand things at bit, so it can be a little intimidating. I'm hoping I can reach out to you and others for any clarification where and when needed. :)

I think I'll start by looking into your above dynamic implementation idea to both, get a better feel and understanding for the code, and obviously see how your idea fits into it at present.

I look forward to diving in soon!

Thank you guys again! Cheers! -Luke

lderienzo avatar Feb 26 '21 17:02 lderienzo

@lderienzo Sure, that sounds great! I am not going to pretend this wasn't a rather complicated area, and with pieces that were not designed for extensibility. At the same time, I haven't spent too much time trying to figure out how to extend it so it is possible there might be new ways of thinking about it. So good luck and let me (and others) know if we can help!

cowtowncoder avatar Feb 26 '21 18:02 cowtowncoder

@cowtowncoder thank you, and I absolutely will!

The first bit of help that I probably could use would be to know what the best way of communicating with you guys would be as I'm going to guess that it probably wouldn't be the best idea to do it in this ticket! Do you have a slack channel or something?

Also, is there a "point person" or "persons" that I should be reaching out to specifically when I have questions? Thanks much again!

lderienzo avatar Mar 01 '21 03:03 lderienzo

While not optimal, tickets are fine, although I'd probably create a separate new one for implementation discussion (either here, or jackson-future-ideas). There is the gitter chat:

https://gitter.im/FasterXML/jackson-databind

which I try to check every now and then; others sometimes too.

As to point persons, I am probably the only consistently active developer wrt databind (some other modules have active maintainers, esp. Scala/Kotlin), but discussions on tickets can attract others via github notifications on comments.

cowtowncoder avatar Mar 01 '21 18:03 cowtowncoder

Gotcha. Awesome, thank you @cowtowncoder :)

lderienzo avatar Mar 01 '21 19:03 lderienzo