concerto
concerto copied to clipboard
Optimize JSON (De)serialization (Final Types et al)
Feature Request 🛍️
Use Case
Allow modellers to declare that types are "final" and cannot be extended. This allows for optimisation of the serialised JSON, as an array of final types can no longer be polymorphic.
The $class is required in serialised instances because concepts can be extended. We could support the addition of a final keyword which would indicate that a concept could not be extended, meaning that the type of an object can be inferred by introspecting the model.
Possible Solution
namespace acme
final concept Person {
o String name
}
final concept Company {
o String name
o Person[] employees
}
const company =
{
"name" : "Clause",
"employees" : [
{"name" : "Dan"}, {"name", "Simon"}
]
}
// deserialize, passing in the name of the type explicitly
// during deserialization the model is introspected to know that company.employee is an Employee type (and cannot be a subclass).
serialiser.fromJSON( company, ''acme.Company' );
Context
Detailed Description
Note that if $class
is missing we could assume that we need to do a type lookup - if the type found is abstract then we fail with an error.
I would suggest that the "final" feature is not needed for this, and you could get the desired savings in more situations via a more general mechanism. Since objects are naturally processed and validated recursively, you should already know the parent's type when you encounter a field value. From the parent's type definition, you already know the declared type of the field. $class could simply be omitted wherever it matches the declared field type. Also, when only the type name within the $class differs from the declared field type, the namespace and version could be omitted, leaving an abbreviated $class field containing only the type name. For example:
abstract concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }
concept Thing {
o Owner owner
}
The $class in the owner could simply be "HumanOwner" or "CorporateOwner". The namespace and version would be assumed to match that of the Owner type in that case.
I opened #542 to represent this broader mechanism, as this issue was still specific to the "final" feature last I looked.
declared
Note that type determination based on the type of the property only works if you assume that the model doesn't change between when the instance was serialised and when it was deserialised. For example, if version 1 of the model is:
concept Owner { ... }
concept Thing {
o Owner owner
}
It is assumed that no $class is required, because the owner
property is of an (non-abstract) type Owner
.
Version 2 of the model is:
concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }
concept Thing {
o Owner owner
}
A $class is now required as a type discriminator to deserialise the Thing.owner
property due to the potential usage of HumanOwner and CorporateOwner
The advantage of the final
keyword (or sometimes called sealed
) is that it makes it much clearer to the modeller than making a final
type non-final is a breaking change - and that change occurs in the namespace of the type that is being declared "unfinal". Without final
keyword any introduction of a derived type, in any namespace, is potentially a breaking change for deserialisation.
Reading #542 I note your suggestion that if $class
is not present we assume that the instance if of the type of the property of the model used during deserialisation (not of the model used during serialiazation). That sounds like an interesting optimisation but does have interesting model evolution implications!
Note that type determination based on the type of the property only works if you assume that the model doesn't change between when the instance was serialised and when it was deserialised.
I don't think that's true. It should work regardless. The $class of the top-level object - which is always required - declares the exact version of the top-level object that you're dealing with. If it's v1 then you know you're in the first case. If it's v2 then you know you're in the second case. There's no ambiguity or assumptions about the model not changing.
(Furthermore, I don't see why a $class would be required even in your v2 model, at least according to my proposed rule. Given that Owner is not abstract, a $class-less owner would still be legal and unambiguous. It would only become required if Owner became abstract.)
As for your final comment, isn't the whole system based on the assumption that specific model versions are immutable? A specific model version [email protected] should be identical during serialization and deserialization. If people are mutating models without changing the version then all kinds of things will break.
I assume there is some fundamental misunderstanding here. Can you post an example model and data that is ambiguous or interesting according to the rules in #542 ?
It seems that your understanding of my proposal is that we allow $class to be omitted when and only when there's exactly one concrete type that a field could possibly be, so that adding more possible types suddenly makes $class required where it wasn't before. Something like this:
abstract concept A { }
concept C {
o A a // $class can be omitted as long as there's only one concrete subtype of A in existence anywhere
}
I agree that would be a bad idea, but it's not my proposal. My proposal is much simpler and more generally applicable. It allows $class to be omitted whenever a field type is not abstract, and when omitted, it is understood to be the exact concrete type the field is declared as. To revisit your v2 example above:
concept Owner { ... }
concept HumanOwner extends Owner { ... }
concept CorporateOwner extends Owner { ... }
concept Thing {
o Owner owner
}
- Can $class be omitted for the "owner" field? Yes, because the field type, Owner, is not abstract.
- What type is inferred when $class is omitted? It is only the exact field type, [email protected], and none other! If you want to use HumanOwner or CorporateOwner, or even to use a different version of Owner (if one exists), then you need a $class.
I don't believe there is any form of model evolution for which this rule causes problems, assuming you're not mutating existing models without changing their versions, which can break all kinds of things. But if you have an example, I'd be glad to see it.
As for making such changes (to whether $class is required) obvious to the modeler, in my example the modeler would have to add or remove the abstract
keyword - a breaking change - just like in your proposal they'd have to add or remove the final
keyword (or, in either case, change the field type - another breaking change). There are no invisible/external or nonbreaking changes that can change the interpretation of the data. (In fact, since the models are versioned and immutable, there is no way to change the interpretation of the data, period.)