message-format-wg
[FEEDBACK] Possible simplification of the data model
I am currently working on implementing the mf2 spec in Python and I'm trying to understand why FunctionExpression and UnsupportedExpression are two separate entities in the data model. Could we simply combine them, similar to LiteralExpression and VariableExpression?
This is what I mean:
  interface FunctionExpression {
    type: "expression";
    arg?: never;
-   annotation: FunctionAnnotation;
+   annotation: FunctionAnnotation | UnsupportedAnnotation;
    attributes: Attribute[];
  }
- interface UnsupportedExpression {
-   type: "expression";
-   arg?: never;
-   annotation: UnsupportedAnnotation;
-   attributes: Attribute[];
- }
This should not introduce any ambiguity because you can always tell which expression you're working with based on the annotation type.
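To illustrate the claim above, here is a minimal sketch of telling the two cases apart under the merged definition. It assumes each annotation carries a discriminating `type` tag; the tag values, the trimmed-down fields, and the helper `describeExpression` are illustrative assumptions, not taken from the proposal.

```typescript
// Assumed annotation shapes: the `type` tags and fields are illustrative.
interface FunctionAnnotation {
  type: "function";
  name: string;
}

interface UnsupportedAnnotation {
  type: "unsupported-annotation";
  source: string;
}

// Simplified stand-in for the real Attribute definition.
interface Attribute {
  name: string;
}

// The merged definition from the proposal above.
interface FunctionExpression {
  type: "expression";
  arg?: never;
  annotation: FunctionAnnotation | UnsupportedAnnotation;
  attributes: Attribute[];
}

// Hypothetical helper: branch on the annotation's tag to find out
// which kind of expression this is.
function describeExpression(expr: FunctionExpression): string {
  if (expr.annotation.type === "function") {
    // Narrowed to FunctionAnnotation here.
    return `function :${expr.annotation.name}`;
  }
  // Narrowed to UnsupportedAnnotation here.
  return `unsupported ${expr.annotation.source}`;
}

const supported: FunctionExpression = {
  type: "expression",
  annotation: { type: "function", name: "number" },
  attributes: [],
};

const unsupported: FunctionExpression = {
  type: "expression",
  annotation: { type: "unsupported-annotation", source: "!foo" },
  attributes: [],
};
```

The check works, but every consumer of the outer type has to repeat it before it can rely on the annotation being usable.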
No, implementing this in type-safe languages would be a nightmare if the same types needed to hold both valid, usable data and invalid or partially invalid data. It makes the outer type nearly useless, and you have to write a lot of code just to figure out what you have and handle the different cases, instead of having the right type up front.
Agreed, but from my understanding that is currently the case with VariableExpression and LiteralExpression anyway. You need to inspect the annotation to know if you're dealing with an unsupported expression. Only with FunctionExpression can you tell immediately by the type itself.
Good catch. In which case I'd argue those should be reworked to match this one, not the other way around. 😉
Indeed, that would be better :) In that case, I'd propose something like this:
  interface LiteralExpression {
    type: "expression";
    arg: Literal;
-   annotation?: FunctionAnnotation | UnsupportedAnnotation;
+   annotation?: FunctionAnnotation;
    attributes: Attribute[];
  }
  interface VariableExpression {
    type: "expression";
    arg: VariableRef;
-   annotation?: FunctionAnnotation | UnsupportedAnnotation;
+   annotation?: FunctionAnnotation;
    attributes: Attribute[];
  }
  interface FunctionExpression {
    type: "expression";
    arg?: never;
    annotation: FunctionAnnotation;
    attributes: Attribute[];
  }
  interface UnsupportedExpression {
    type: "expression";
-   arg?: never;
+   arg?: Literal | VariableRef;
    annotation: UnsupportedAnnotation;
    attributes: Attribute[];
  }
> I am currently working on implementing the mf2 spec in Python and I'm trying to understand why FunctionExpression and UnsupportedExpression are two separate entities in the data model. Could we simply combine them, similar to LiteralExpression and VariableExpression?
Sure, we could, but that wouldn't really change anything. Note that these are TypeScript interface definitions, so a value matching the current definition would also match your proposed alternative.
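The point about structural typing can be sketched as follows. The annotation and attribute shapes, the name `MergedFunctionExpression`, and the sample values are assumptions made for illustration; only the two expression interfaces mirror the definitions quoted earlier in the thread.

```typescript
// Assumed annotation shapes: the `type` tags and fields are illustrative.
interface FunctionAnnotation {
  type: "function";
  name: string;
}

interface UnsupportedAnnotation {
  type: "unsupported-annotation";
  source: string;
}

// Simplified stand-in for the real Attribute definition.
interface Attribute {
  name: string;
}

// The separate definition, as quoted in the original proposal.
interface UnsupportedExpression {
  type: "expression";
  arg?: never;
  annotation: UnsupportedAnnotation;
  attributes: Attribute[];
}

// The proposed merged definition, renamed here to keep both in scope.
interface MergedFunctionExpression {
  type: "expression";
  arg?: never;
  annotation: FunctionAnnotation | UnsupportedAnnotation;
  attributes: Attribute[];
}

const current: UnsupportedExpression = {
  type: "expression",
  annotation: { type: "unsupported-annotation", source: "!foo" },
  attributes: [],
};

// No cast is needed: TypeScript compares shapes, not interface names,
// so a value of the narrower type is assignable to the wider one.
const merged: MergedFunctionExpression = current;
```

The reverse assignment would be rejected by the compiler, because a merged value might carry a FunctionAnnotation.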
There are two main reasons why Expression is split up the way it is:
- We need a VariableExpression definition, because it's used in InputDeclaration.
- We want to represent the requirement of having at least arg or annotation be non-empty in the data model.
So when implementing your internal data model, you may want to see if you can drop those requirements, which allows for a single, simpler Expression:
interface Expression {
  type: "expression";
  arg?: Literal | VariableRef;
  annotation?: FunctionAnnotation | UnsupportedAnnotation;
  attributes: Attribute[];
}
To use that, you'll need to separately verify that either arg or annotation is present, and you'll need to collapse the declarations into a single definition:
interface Declaration {
  name: string;
  value: Expression;
}
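The separate check mentioned above could look roughly like this. It is a sketch under assumptions: the shapes of Literal, VariableRef, the annotations and Attribute are simplified stand-ins, and the helpers `hasArgOrAnnotation` and `findInvalidDeclarations` are hypothetical names, not part of the spec or of any library.

```typescript
// Assumed, simplified shapes for the types the Expression refers to.
interface Literal {
  type: "literal";
  value: string;
}

interface VariableRef {
  type: "variable";
  name: string;
}

interface FunctionAnnotation {
  type: "function";
  name: string;
}

interface UnsupportedAnnotation {
  type: "unsupported-annotation";
  source: string;
}

interface Attribute {
  name: string;
}

// The single, simpler Expression and Declaration from the reply above.
interface Expression {
  type: "expression";
  arg?: Literal | VariableRef;
  annotation?: FunctionAnnotation | UnsupportedAnnotation;
  attributes: Attribute[];
}

interface Declaration {
  name: string;
  value: Expression;
}

// Hypothetical helper: the type system no longer guarantees that at
// least one of `arg` and `annotation` is present, so check at runtime.
function hasArgOrAnnotation(expr: Expression): boolean {
  return expr.arg !== undefined || expr.annotation !== undefined;
}

// Hypothetical helper: return the names of declarations whose
// expression has neither an argument nor an annotation.
function findInvalidDeclarations(decls: Declaration[]): string[] {
  return decls
    .filter((decl) => !hasArgOrAnnotation(decl.value))
    .map((decl) => decl.name);
}

const declarations: Declaration[] = [
  {
    name: "count",
    value: {
      type: "expression",
      arg: { type: "variable", name: "count" },
      annotation: { type: "function", name: "number" },
      attributes: [],
    },
  },
  // Accepted by the relaxed type, but not a valid expression.
  { name: "empty", value: { type: "expression", attributes: [] } },
];

const invalidNames = findInvalidDeclarations(declarations);
```

The trade-off is the one discussed earlier in the thread: the model is smaller, but the invariant moves from the type definitions into code that has to be called explicitly.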
This is all possible because the TS representation of the data model is not really intended to support interchange between systems; that's what the JSON Schema and DTD definitions are for.
For an example Python datamodel that applies the above simplifications, see message.py in the moz.l10n package that I'm currently working on.
I think this can be closed, since the reserved/unsupported syntax has been removed.