concerto
concerto copied to clipboard
Concerto should allow $class to be omitted or abbreviated when it's obvious from context
The need to provide a full $class declaration on every subobject in an object tree adds a lot of redundant information to a serialized Concerto object. In nearly all cases it could be removed or abbreviated without loss of information. This has a few benefits:
- Less redundant and more aesthetically pleasing data
- Closer to the "just the data, please" ideal
- Easier to programmatically manage changes to the version of the data, and less likely to end up in Frankenstein scenarios where different parts of an object have different model versions, if the version is only declared in one place
Feature Request
The $class property should not be required where it can be inferred from context, and an abbreviated form of the $class property should be allowed when it can be mostly inferred from context. The only times a full $class property is really needed are 1) on the top-level object, and 2) when referencing a type from another namespace or version. Otherwise, the $class property could usually be omitted entirely, even in some polymorphic cases. In the polymorphic cases where a $class property is needed, it's rare to need a full $class; usually the type name only would suffice, without the namespace and version, which can be inferred from the containing object.
If there is only one $class attribute, at the top level, it can easily be altered to change the version. If there's a $class attribute on every subobject, the user would need to recursively update all of them. Done programmatically, this requires significantly more complicated logic, and if they aren't so careful to update all the $class attributes, then they can easily get objects where some field values have version X and other field values have version Y. The object could no longer be said to come from a single model version, but would be a mix of multiple model versions. Perhaps in rare cases that might be desirable, but it should be something that's hard to do by accident.
I'll note that this request is about deserializing and validating objects. It would also be nice if Concerto would produce a more compact serialized format as well, but that is outside the scope of this request. (#482 is somewhat related, describing a "final" feature to allow Concerto to omit $class in one specific case. I notice that the title of that issue was changed earlier today to something broader, but the body is still about the "final" feature.)
Possible Solution
- A full $class is required at the top level. Based on this, you know the concept of the top-level object and the declared type of every field in that concept.
- In each data field (or each element if the field is an array), if the $class is omitted, it's assumed to be the same as the declared field type. (If the declared field type is abstract, that's an error.) If $class is just a type name, the namespace and version are assumed to be the same as the declared field type.
- The logic applies recursively.
I'll note that this is backwards-compatible with the existing, more verbose specification that requires a full $class property on every sub-object.
I've implemented an initial solution in #577.
The changes to the spec above are:
- Extension: I allow for basic polymorphism where there is a single concrete subtype. This allows us to handle some uses of abstract concepts.
- Omission: I haven't implemented an abbreviated form of the $class discriminator. I'd like to consider this separately in the context of #480.
Summary of discussion on Working Group call, 12 December 2022. https://vimeo.com/783301280
The solution as implemented in #577 causes the model manager context to affect the validity of an instance. For example:
Suppose a model M.cto:
namespace M
concept C {
o String x
}
The instance would validate against this model (given a root type of M.C
).
{ "x": "X" }
Now suppose that an additional model is added to the model manager:
namespace N
import M.C
concept D extends C {}
When attempting to validate the original instance with both models loaded, there is no single concrete subtype to choose.
This proves that adding a Concept (even in a separate namespace) can be a breaking change with respect to validation and deserialization. This is not something that we should support.
We propose to defer this feature until we also support final
annotations. This would allow us to unambiguously infer the types for only those with a final
annotation. #482.
Why not just treat objects as instances of the declared field type if $class is omitted? It's much simpler than a final
annotation and more general.
Your example doesn't match my understanding, since I believe a $class is still required on a top-level instance, but I understand what you mean at least. But given two namespaces similar to yours...
namespace M
concept Top {
o B o
}
concept B {
o String s
}
namespace N
import M.*
concept D extends B { }
... then this instance is unambiguous:
{ "$class": "M.Top", "o": { "s": "string" } }
In this case, o
must be an instance of M.B
, because the field is declared as being of type M.B
. It is not and cannot be N.D
. That would require a $class:
{ "$class": "M.Top", "o": { "$class": "N.D", "s": "string" } }
This is related to https://github.com/accordproject/concerto/issues/482 and should be tackled in the context of strict: true
This is related to #482 and should be tackled in the context of
strict: true
I'm not sure what you mean about strict: true
, but if you're implying that $class should be allowed to be omitted only when strict
is false, I'd disagree. Strictness presumably means forcing adherence to the rules; I'm arguing for a changing of the rules (for everyone) to relax them in cases where it doesn't introduce any ambiguity. I don't know which rules get ignored when strict
is false, but I doubt I'd want to have to opt into that general leniency to get this feature.
I'm not sure what you mean about
strict: true
.
In short, "strict-mode" means that versioning is mandatory. This removes ambiguity which makes this kind of type inference easier.
Strict-mode will become the default in a future release.
This looks like a good candidate for #706.
One other observation is that when we omit $class
from non-root types we make it harder to copy/paste (or programatically move) JSON nodes around, because the root node is "special" and has different rules from the non-root nodes (which are contextualised by their root). This is not a problem per-se, but is something we should bear in mind. We may want to consider adding helper functions to contextualise / decontextualise nodes.
We should also specify whether $class is permitted but optional everywhere except root (e.g. for backwards compatibility).