Schema.NET Support multiple `@type` values.

Describe the feature

This is a bit of a strange topic, but currently Google does not automatically connect the "Car" data type and the "Product" data type. Their recommendation is to use both:

Note: Currently Car is not supported automatically as a subtype of Product. So for now, you will need to include both Car and Product types if you would like to attach ratings to it and be eligible for the Search feature. For example in JSON-LD:
{
   "@context": "https://schema.org",
   "@type": ["Product", "Car"]
}
- Source: https://developers.google.com/search/docs/advanced/structured-data/product#product-properties

However, there is no option to specify the @type as multiple types (that I can find) using this library. I did notice the AdditionalTypes property, which is limited to only Uris and not strings like "Product". Theoretically, we could specify the Uri http://schema.org/Product. This supposes that Google will actually support this property instead of the format that is outlined in their documentation.

I looked through the docs for JSON-LD, and it is in fact possible to specify an array for @type, so I think it should be supported out of the box in this library. If there is currently a way to do it, let this issue serve as a request for documentation.

https://json-ld.org/spec/latest/json-ld/#example-14-specifying-multiple-types-for-a-node

Schema objects

https://schema.org/Thing

Jul 18 '22 18:07 johnrom

Thanks for the links. Its not currently supported today and I agree that we should add support. We'd accept a PR to that affect. I guess the Type property would become a string array, the custom serialization/deserialization logic would also need to add support for an array.

The difficulty is what happens on the consumer side. Perhaps we can use a Tuple to represent objects of this type.

Jul 19 '22 09:07 RehanSaeed

While it probably works well enough for the example of "Product" and "Car", this would have some really weird issues when it comes to types that don't inherit from each other. A JSON representation can represent two completely distinct types at once but in C# we could only represent types within the same hierarchy. To attempt to do so, the only way would be to find a common base type - for your example it might be "Product" (losing typed-access to "Car" properties) but for others it might be "Thing".

We could probably get away with it more for serialization than deserialization. For the former, we don't use the @type property instead relying on the C# type. For the latter, we use it as the only way to find the C# type we're deserializing to.

Jul 23 '22 06:07 Turnerj

There is also an adjacent issue to this due to how we handle deserialization to a Values<T1, ...> type. We attempt to deserialize by the order of the types from right-to-left to find the right type. We'd potentially need to deserialize the same block of JSON into multiple types at once if one of the types isn't parent/grand parent type of the other.

For example with Values<IProduct, ICar>, we can deserialize ["Product", "Car"] to ICar because it inherits IProduct. If it were ["Product", "Organization"] going to Values<IProduct, IOrganization>, we'd need to deserialize it twice to capture all the data or otherwise identify the "primary" type we are deserializing to.

Jul 23 '22 06:07 Turnerj

I imagine the @type being a OneOrMany, that change would have to be applied everywhere, but deserialization should still attempt to find a best common denominator of all the types (like it currently does), so a product, car could automatically map to Car since it contains all the subtypes. If it fails to find a common heir, it would be an exception. To opt in to a complex type like something that is a car AND an helicopter (flying car? Bad example) you would have to create a parent type inheriting from both to deserialize, and register it via DI.

I think the same parent type usage could be used to make serialization print multiple @types, just an empty class inheriting from Car with an overridden @type field.

I think we could add priority to the types and use dependency injection to register the assorted types in order, so a subclass can get a first chance to match a type with multiple - so it could deserialize back to multiple.

Just a thought, sorry for formatting I am on mobile.

Edit: lots of edits

Jul 23 '22 18:07 johnrom

I imagine the @type being a OneOrMany, that change would have to be applied everywhere...

Yeah, that's what it would need to be to make part of the serialization/deserialization work out-of-the-box in terms of the raw value.

... but deserialization should still attempt to find a best common denominator of all the types (like it currently does), so a product, car could automatically map to Car since it contains all the subtypes. If it fails to find a common heir, it would be an exception.

Yep, that's where the more complex logic will need to be. Effectively rewriting this section with a bunch of additional rules and checks.

To opt in to a complex type like something that is a car AND an helicopter (flying car? Bad example) you would have to create a parent type inheriting from both to deserialize, and register it via DI.

Maybe though currently the library doesn't have any hooks into DI to be aware of something like that. We do some assembly scanning stuff but the primary purpose of that is to find Car from ICar or whatever other interface is on the model.

We definitely could move that direction and have our SchemaSerializer class be more configurable via creating instances of it rather than being a static class.

I think the same parent type usage could be used to make serialization print multiple @types, just an empty class inheriting from Car with an overridden @type field.

I think we could add priority to the types and use dependency injection to register the assorted types in order, so a subclass can get a first chance to match a type with multiple - so it could deserialize back to multiple.

Not entirely sure what you mean here. Like I know generally what the words mean but how what I understand you saying would all work together etc.

Maybe to summarise my previous comments and thoughts:

Serialization We'll basically be fine supporting serialization with OneOrMany<string> of @type with little to change in the most basic approach as we don't actually care about the property for our serialization logic.

Deserialization We heavily lean on @type to determine what our deserialization target is, especially because it can be of a more specific type than the property we are deserializing to (eg. deserializing a Car into a Values<IProduct>). When we don't or can't determine a C# type target, we basically try deserializing to each type in a Values<T, ... > until we have one that doesn't throw an exception.

With multiple @type values, we'd need to check the hierarchy of each against each other. So if it was Car, Product and Organization then this could be an example hierarchy:

Car -> Product -> Thing
Product -> Thing
Organization -> Thing

I guess first logic is whether they all share a common type at all. We can see it fairly clearly here that they all share Thing so that is great. Now, we'd need to check that this common type is allowed to be deserialized to our current property (eg. Values<string, int, IThing>). If the common type can't, we'd need to throw an exception. If it can, we could proceed with deserialization but only to the common type so we'd lose any data that isn't captured in this base type.

Now my example here is intentionally nonsense - it isn't going to be a Car/Product/Organization. The truth behind the example is that this problem could exist in any of the more deeply nested types like places or locations etc.

With your specific example, we have the potential to avoid the common type/data loss problem because Car inherits Product. So in addition to finding a common type, if the types represent a linear hierarchy (rather than something that branches to various sub types), we could find the most derived type and deserialize to that.

I'm not sure how best to approach this type of check though as comparing all types against all other types, comparing hierarchies seems like one of those obscure/complex programming problems.

Jul 24 '22 06:07 Turnerj

Schema.NET Schema.NET copied to clipboard

Support multiple `@type` values.

Describe the feature

Schema objects

Schema.NET
Schema.NET copied to clipboard