jsonb-api icon indicating copy to clipboard operation
jsonb-api copied to clipboard

Unclear when to include polymorphic information when runtimeType for serialization is thrown away

Open asvanberg opened this issue 2 years ago • 11 comments

At the top-level of Jsonb you can specify a runtimeType when serializing but that information is lost as you go down the object graph via SerializationContext method serialize(...).

With the upcoming addition of polymorphic types this will become an issues. What is expected to happen in the following scenarios;

@JsonbPolymorphicType(
  key = "@race",
   value = {
    @JsonbSubtype(alias = "cat", type = Cat.class),
    @JsonbSubtype(alias = "dog", type = Dog.class)
  }
)
interface Animal {}

record Cat(String color) implements Animal
record Dog(String name) implements Animal
record Person(Dog bestFriend)

Dog dog = new Dog("Fred")
List<Animal> animals = List.of(new Cat("red"), dog)
Person p = new Person(dog)

// What is the expected output of the following
jsonb.toJson(animals) // scenario 1
jsonb.toJson(dog) // scenario 2
jsonb.toJson(dog, Animal.class) // scenario 3
jsonb.toJson(p) // scenario 4
  1. Everyone would probably agree on [{"@race": "cat", "color": "red"}, {"@race": "dog", "name": "Fred"}]
  2. Should polymorphic information be included here? We're serializing a dog, not an animal.
  3. Polymorphic information should most likely be included.
  4. In the final scenario it is not even possible to make the distinction because in the serializer for Person we have to call the SerializationContext#serialize method to serialize the dog which only takes an object and we can't even specify the runtimeType which could be based on, for example, the return type of the method that gave us the object.

asvanberg avatar Jan 09 '22 17:01 asvanberg

Hi @asvanberg ,

For advanced serialization you have to use a custom adapter or (de)serializer (=codec), by default the runtime has the needed information so I'm not sure where this information can actually miss by itself but we can likely specify that a serializer prevents JsonbPolymorphicType usage (or superseeds more exactly). The other note is that the runtimeType we find in jsonb is never actually required to be used so an be considered as not there as of today for serialization by spec AFAIK.

rmannibucau avatar Jan 09 '22 17:01 rmannibucau

What do you mean when you say "advanced serialization"? I'm not talking about writing or using any user-provided serializers, only default behavior. What is the expected output of scenario 4? Dog is not a polymorphic type but if the type of the object is the only thing we have to go by then either we always scan the hierarchy to see if it is part of one and include it, or not.

If the Jsonb methods for serialization with a Type parameter should be ignored, how about deprecating them like @JsonbProperty#nillable?

asvanberg avatar Jan 09 '22 17:01 asvanberg

@asvanberg to answer more precisely:

  1. yes
  2. yes
  3. same as 2
  4. as 1 in bestFriend - object - attribute of the outcoming JSON (person is a wrapping object there)

About runtimeType it is more there for a future use I think but we didn't get a real usage/need yet so, even if I understand the idea of deprecation today, i'm not sure we should right now (maybe a backlog task for 3.0?).

rmannibucau avatar Jan 09 '22 18:01 rmannibucau

Then I would say this is the perfect time to make use of that runtimeType parameter (and also extend the SerializationContext interface to include it). That seems like a better path forward than forcing polymorphic type to always be included even when serializing a known subtype (such as scenario 4 above). By having the the possibility to specify the runtimeType when calling serialize on the SerializationContext you can specify that yes, I do want to serialize as the polymorphic Animal type.

asvanberg avatar Jan 09 '22 19:01 asvanberg

@asvanberg not sure i'm following you, runtimeType is unrelated to polymorphism feature as well as SerializationContext. Do you want to drop polymorphism annotation and replace it with a (de)serializer codec using the runtime type?

rmannibucau avatar Jan 09 '22 20:01 rmannibucau

No, I do not want to drop the polymorphism annotations. What I'm suggesting is to only include polymorphic information if you're explicitly serializing a type annotated with @JsonbPolymorphicType with the explicit type being the runtimeType provided to Jsonb which would also require methods taking such a parameter to be added to the SerializationContext interface as well.

You said that polymorphic type should always be included even when it is unnecessary (such as scenario 4 above). This feels like the wrong approach to take and it should only be included when you're trying to serialize an object as a polymorphic type. With "as a polymorphic type" being the equivalent of passing the polymorphic type as the runtimeType parameter to the existing Jsonb methods or the (to be added) serialize​(T object, JsonGenerator generator, Type runtimeType) method of SerializationContext.

This would change the scenarios above to;

  1. Yes. The parameterized type List<Animal> would have its first type argument be an explicitly annotated polymorphic type
  2. No. The Dog class is not a explicitly annotated polymorphic type
  3. Yes. It is explicitly requested to serialize the Dog object as the annotated polymorphic type Animal
  4. No (see 2). However if the type of the record component was change to Animal then yes, it would be included

asvanberg avatar Jan 09 '22 20:01 asvanberg

What if dog is declared as an Animal like:

Animal dog = new Dog("Fred")
jsonb.toJson(dog);
  1. No. The Dog class is not a explicitly annotated polymorphic type

Excluding the polymorphic type to me implies that the "reader" knows that it is a dog (the json parser that will read the json content MUST already knows its a Dog type of Animal)?

Edit: more explicit wording

rbygrave avatar Jan 09 '22 20:01 rbygrave

The JSON-B runtime can not know what type your variable is so that information is not available at all.

The consumer of the generated JSON would presumably read the documentation you provided to them and that should state that you're sending back dogs and not any type of animal. Therefore they should be fully aware that is always a dog and can then read how those are serialized.

asvanberg avatar Jan 09 '22 20:01 asvanberg

@asvanberg it does assumes a lot about the code and it is not generally true. To caricature a bit you can return Dog and it should still work if you return a BullDog so not sure we can do it without a deactivation flag which would make it more complicated for a poor gain IMHO, personally I would use a DogModel copy to handle that case since the type attribute does not hurt at all.

rmannibucau avatar Jan 10 '22 08:01 rmannibucau

What assumptions are you talking about that are not true?

I do not follow what you mean with "still work if you return a BullDog"? If you mean the Dog attribute is now an instance of BullDog? If so that should still be serialized as a Dog since that is the return type of the method and would contain no polymorphic information since Dog does not include a polymorphism annotation.

You are correct that including polymorphism information in the JSON does not hurt. However it does confuse the consumer since they would think they could get the whole hierarchy when that is not the case.

Forcing runtimes to always include it mean they have to scan the entire class hierarchy to find which ones they are a part of and potentially error out if there are multiple and they are not linear. If you have to be explicit about which hierarchy you want to serialize the object as a part of then that error condition goes away and there is never a need to scan class hierarchies. It would simplify implementing runtimes.

I feel like having no way to opt-out of polymorphic information other than duplicating the classes without the annotations is a bad path to take and I would rather see a solution where you have to specify which hierarchy to serialize if it is not given by the object type (which it would be in the case of a List<Animal> for example).

asvanberg avatar Jan 10 '22 12:01 asvanberg

@asvanberg as soon as you get a polymorphic hierarchy, all subclasses must keep the discriminator to let the right class be loaded. You mentionned "yes but if I know I have a dog" case, I just mention that you can have, in this exact case, a Bulldog extends Dog and still need the polymorphic data. So "However it does confuse the consumer since they would think they could get the whole hierarchy when that is not the case." is stricly true but in practise it is what the consumer will need since you don't get the full tree but a subtree in practise.

Forcing runtimes to always include it mean they have to scan the entire class hierarchy to find which ones they are a part of and potentially error out if there are multiple and they are not linear. If you have to be explicit about which hierarchy you want to serialize the object as a part of then that error condition goes away and there is never a need to scan class hierarchies. It would simplify implementing runtimes.

It is the case anyway whatever you do (you browse the full hierarchy to have attributes) so in terms of perf or complexity for vendors it is 1-1. I would also add - as a side note ;) - we care less of vendors than users at spec level so I don't think it is a point we should take into consideration until it implies some real implementation limitation - classpath scanning would be to give an example but it does not apply there.

I feel like having no way to opt-out of polymorphic information other than duplicating the classes without the annotations is a bad path to take and I would rather see a solution where you have to specify which hierarchy to serialize if it is not given by the object type (which it would be in the case of a List<Animal> for example).

To be honest the polymorphism as stated is just to give a default which works most of the time but - summarizing in 2 lines days of discussion ;) - the fact is you need in all other cases to implement you own codec because there is nothing like "polymorphism" in JSON and there are as much needs as application so we'll never support them all through this feature (@JsonbPolymorphic) since it would imply making this particular feature as complex as the full JSON-B spec.

rmannibucau avatar Jan 10 '22 14:01 rmannibucau