concerto icon indicating copy to clipboard operation
concerto copied to clipboard

Concerto should allow $class to be omitted or abbreviated when it's obvious from context

Open DS-AdamMilazzo opened this issue 2 years ago • 7 comments

The need to provide a full $class declaration on every subobject in an object tree adds a lot of redundant information to a serialized Concerto object. In nearly all cases it could be removed or abbreviated without loss of information. This has a few benefits:

  • Less redundant and more aesthetically pleasing data
  • Closer to the "just the data, please" ideal
  • Easier to programmatically manage changes to the version of the data, and less likely to end up in Frankenstein scenarios where different parts of an object have different model versions, if the version is only declared in one place

Feature Request

The $class property should not be required where it can be inferred from context, and an abbreviated form of the $class property should be allowed when it can be mostly inferred from context. The only times a full $class property is really needed are 1) on the top-level object, and 2) when referencing a type from another namespace or version. Otherwise, the $class property could usually be omitted entirely, even in some polymorphic cases. In the polymorphic cases where a $class property is needed, it's rare to need a full $class; usually the type name only would suffice, without the namespace and version, which can be inferred from the containing object.

If there is only one $class attribute, at the top level, it can easily be altered to change the version. If there's a $class attribute on every subobject, the user would need to recursively update all of them. Done programmatically, this requires significantly more complicated logic, and if they aren't so careful to update all the $class attributes, then they can easily get objects where some field values have version X and other field values have version Y. The object could no longer be said to come from a single model version, but would be a mix of multiple model versions. Perhaps in rare cases that might be desirable, but it should be something that's hard to do by accident.

I'll note that this request is about deserializing and validating objects. It would also be nice if Concerto would produce a more compact serialized format as well, but that is outside the scope of this request. (#482 is somewhat related, describing a "final" feature to allow Concerto to omit $class in one specific case. I notice that the title of that issue was changed earlier today to something broader, but the body is still about the "final" feature.)

Possible Solution

  1. A full $class is required at the top level. Based on this, you know the concept of the top-level object and the declared type of every field in that concept.
  2. In each data field (or each element if the field is an array), if the $class is omitted, it's assumed to be the same as the declared field type. (If the declared field type is abstract, that's an error.) If $class is just a type name, the namespace and version are assumed to be the same as the declared field type.
  3. The logic applies recursively.

I'll note that this is backwards-compatible with the existing, more verbose specification that requires a full $class property on every sub-object.

DS-AdamMilazzo avatar Nov 02 '22 16:11 DS-AdamMilazzo

I've implemented an initial solution in #577.

The changes to the spec above are:

  • Extension: I allow for basic polymorphism where there is a single concrete subtype. This allows us to handle some uses of abstract concepts.
  • Omission: I haven't implemented an abbreviated form of the $class discriminator. I'd like to consider this separately in the context of #480.

mttrbrts avatar Dec 16 '22 13:12 mttrbrts

Summary of discussion on Working Group call, 12 December 2022. https://vimeo.com/783301280

The solution as implemented in #577 causes the model manager context to affect the validity of an instance. For example:

Suppose a model M.cto:

namespace M

concept C {
  o String x
}

The instance would validate against this model (given a root type of M.C).

{ "x": "X" }

Now suppose that an additional model is added to the model manager:

namespace N

import M.C

concept D extends C {}

When attempting to validate the original instance with both models loaded, there is no single concrete subtype to choose.

This proves that adding a Concept (even in a separate namespace) can be a breaking change with respect to validation and deserialization. This is not something that we should support.

We propose to defer this feature until we also support final annotations. This would allow us to unambiguously infer the types for only those with a final annotation. #482.

mttrbrts avatar Dec 21 '22 20:12 mttrbrts

Why not just treat objects as instances of the declared field type if $class is omitted? It's much simpler than a final annotation and more general.

Your example doesn't match my understanding, since I believe a $class is still required on a top-level instance, but I understand what you mean at least. But given two namespaces similar to yours...

namespace M

concept Top {
  o B o
}

concept B {
  o String s
}
namespace N
import M.*
concept D extends B { }

... then this instance is unambiguous:

{ "$class": "M.Top", "o": { "s": "string" } }

In this case, o must be an instance of M.B, because the field is declared as being of type M.B. It is not and cannot be N.D. That would require a $class:

{ "$class": "M.Top", "o": { "$class": "N.D", "s": "string" } }

DS-AdamMilazzo avatar Dec 22 '22 00:12 DS-AdamMilazzo

This is related to https://github.com/accordproject/concerto/issues/482 and should be tackled in the context of strict: true

dselman avatar Aug 16 '23 12:08 dselman

This is related to #482 and should be tackled in the context of strict: true

I'm not sure what you mean about strict: true, but if you're implying that $class should be allowed to be omitted only when strict is false, I'd disagree. Strictness presumably means forcing adherence to the rules; I'm arguing for a changing of the rules (for everyone) to relax them in cases where it doesn't introduce any ambiguity. I don't know which rules get ignored when strict is false, but I doubt I'd want to have to opt into that general leniency to get this feature.

DS-AdamMilazzo avatar Aug 16 '23 15:08 DS-AdamMilazzo

I'm not sure what you mean about strict: true.

In short, "strict-mode" means that versioning is mandatory. This removes ambiguity which makes this kind of type inference easier.

Strict-mode will become the default in a future release.

mttrbrts avatar Aug 17 '23 11:08 mttrbrts

This looks like a good candidate for #706.

One other observation is that when we omit $class from non-root types we make it harder to copy/paste (or programatically move) JSON nodes around, because the root node is "special" and has different rules from the non-root nodes (which are contextualised by their root). This is not a problem per-se, but is something we should bear in mind. We may want to consider adding helper functions to contextualise / decontextualise nodes.

We should also specify whether $class is permitted but optional everywhere except root (e.g. for backwards compatibility).

dselman avatar Sep 08 '23 12:09 dselman