Schema.NET
Schema.NET copied to clipboard
Support multiple `@type` values.
Describe the feature
This is a bit of a strange topic, but currently Google does not automatically connect the "Car" data type and the "Product" data type. Their recommendation is to use both:
Note: Currently Car is not supported automatically as a subtype of Product. So for now, you will need to include both Car and Product types if you would like to attach ratings to it and be eligible for the Search feature. For example in JSON-LD:
{ "@context": "https://schema.org", "@type": ["Product", "Car"] }
- Source: https://developers.google.com/search/docs/advanced/structured-data/product#product-properties
However, there is no option to specify the @type
as multiple types (that I can find) using this library. I did notice the AdditionalTypes
property, which is limited to only Uris and not strings like "Product". Theoretically, we could specify the Uri http://schema.org/Product
. This supposes that Google will actually support this property instead of the format that is outlined in their documentation.
I looked through the docs for JSON-LD, and it is in fact possible to specify an array for @type, so I think it should be supported out of the box in this library. If there is currently a way to do it, let this issue serve as a request for documentation.
https://json-ld.org/spec/latest/json-ld/#example-14-specifying-multiple-types-for-a-node
Schema objects
- https://schema.org/Thing
Thanks for the links. Its not currently supported today and I agree that we should add support. We'd accept a PR to that affect. I guess the Type property would become a string array, the custom serialization/deserialization logic would also need to add support for an array.
The difficulty is what happens on the consumer side. Perhaps we can use a Tuple
to represent objects of this type.
While it probably works well enough for the example of "Product" and "Car", this would have some really weird issues when it comes to types that don't inherit from each other. A JSON representation can represent two completely distinct types at once but in C# we could only represent types within the same hierarchy. To attempt to do so, the only way would be to find a common base type - for your example it might be "Product" (losing typed-access to "Car" properties) but for others it might be "Thing".
We could probably get away with it more for serialization than deserialization. For the former, we don't use the @type
property instead relying on the C# type. For the latter, we use it as the only way to find the C# type we're deserializing to.
There is also an adjacent issue to this due to how we handle deserialization to a Values<T1, ...>
type. We attempt to deserialize by the order of the types from right-to-left to find the right type. We'd potentially need to deserialize the same block of JSON into multiple types at once if one of the types isn't parent/grand parent type of the other.
For example with Values<IProduct, ICar>
, we can deserialize ["Product", "Car"]
to ICar
because it inherits IProduct
. If it were ["Product", "Organization"]
going to Values<IProduct, IOrganization>
, we'd need to deserialize it twice to capture all the data or otherwise identify the "primary" type we are deserializing to.
I imagine the @type being a OneOrMany, that change would have to be applied everywhere, but deserialization should still attempt to find a best common denominator of all the types (like it currently does), so a product, car could automatically map to Car since it contains all the subtypes. If it fails to find a common heir, it would be an exception. To opt in to a complex type like something that is a car AND an helicopter (flying car? Bad example) you would have to create a parent type inheriting from both to deserialize, and register it via DI.
I think the same parent type usage could be used to make serialization print multiple @type
s, just an empty class inheriting from Car with an overridden @type field.
I think we could add priority to the types and use dependency injection to register the assorted types in order, so a subclass can get a first chance to match a type with multiple - so it could deserialize back to multiple.
Just a thought, sorry for formatting I am on mobile.
Edit: lots of edits
I imagine the
@type
being a OneOrMany, that change would have to be applied everywhere...
Yeah, that's what it would need to be to make part of the serialization/deserialization work out-of-the-box in terms of the raw value.
... but deserialization should still attempt to find a best common denominator of all the types (like it currently does), so a product, car could automatically map to Car since it contains all the subtypes. If it fails to find a common heir, it would be an exception.
Yep, that's where the more complex logic will need to be. Effectively rewriting this section with a bunch of additional rules and checks.
To opt in to a complex type like something that is a car AND an helicopter (flying car? Bad example) you would have to create a parent type inheriting from both to deserialize, and register it via DI.
Maybe though currently the library doesn't have any hooks into DI to be aware of something like that. We do some assembly scanning stuff but the primary purpose of that is to find Car
from ICar
or whatever other interface is on the model.
We definitely could move that direction and have our SchemaSerializer
class be more configurable via creating instances of it rather than being a static class.
I think the same parent type usage could be used to make serialization print multiple
@type
s, just an empty class inheriting from Car with an overridden@type
field.I think we could add priority to the types and use dependency injection to register the assorted types in order, so a subclass can get a first chance to match a type with multiple - so it could deserialize back to multiple.
Not entirely sure what you mean here. Like I know generally what the words mean but how what I understand you saying would all work together etc.
Maybe to summarise my previous comments and thoughts:
Serialization
We'll basically be fine supporting serialization with OneOrMany<string>
of @type
with little to change in the most basic approach as we don't actually care about the property for our serialization logic.
Deserialization
We heavily lean on @type
to determine what our deserialization target is, especially because it can be of a more specific type than the property we are deserializing to (eg. deserializing a Car
into a Values<IProduct>
). When we don't or can't determine a C# type target, we basically try deserializing to each type in a Values<T, ... >
until we have one that doesn't throw an exception.
With multiple @type
values, we'd need to check the hierarchy of each against each other. So if it was Car
, Product
and Organization
then this could be an example hierarchy:
- Car -> Product -> Thing
- Product -> Thing
- Organization -> Thing
I guess first logic is whether they all share a common type at all. We can see it fairly clearly here that they all share Thing
so that is great. Now, we'd need to check that this common type is allowed to be deserialized to our current property (eg. Values<string, int, IThing>
). If the common type can't, we'd need to throw an exception. If it can, we could proceed with deserialization but only to the common type so we'd lose any data that isn't captured in this base type.
Now my example here is intentionally nonsense - it isn't going to be a Car/Product/Organization. The truth behind the example is that this problem could exist in any of the more deeply nested types like places or locations etc.
With your specific example, we have the potential to avoid the common type/data loss problem because Car inherits Product. So in addition to finding a common type, if the types represent a linear hierarchy (rather than something that branches to various sub types), we could find the most derived type and deserialize to that.
I'm not sure how best to approach this type of check though as comparing all types against all other types, comparing hierarchies seems like one of those obscure/complex programming problems.