Inconsistent validation behavior on cross-reference alternatives
For alternatives of cross-references, the validator accepts some grammars even though they won't work at runtime. Other grammars get rejected even though I assume they would work at runtime.
Our scenario:
- The reference target can be ProcedureTypeA or ProcedureTypeB
- References can be simple names or fully qualified names
Attempt 1:
ProcedureCall:
    procedure=([ProcedureTypeA:SIMPLE_NAME] | [ProcedureTypeA:FULLY_QUALIFIED_NAME] | [ProcedureTypeB:SIMPLE_NAME] | [ProcedureTypeB:FULLY_QUALIFIED_NAME]);
This is rejected by the validator: "Mixing a cross-reference with other types is not supported. Consider splitting property "procedure" into two or more different properties." I think it's okay that the validator rejects it with that error message.
Attempt 2:
ProcedureCall:
    (procedure=[ProcedureTypeA:SIMPLE_NAME] | procedure=[ProcedureTypeA:FULLY_QUALIFIED_NAME] | procedure=[ProcedureTypeB:SIMPLE_NAME] | procedure=[ProcedureTypeB:FULLY_QUALIFIED_NAME]);
This is accepted by the validator, but it doesn't work at runtime. I think it is unsupported by Langium. See this unreachable code in getReferenceType() of ast.ts:
switch (referenceId) {
    case 'ProcedureCall:procedure': {
        return ProcedureTypeA;
    }
    // Duplicate label: this case can never be reached.
    case 'ProcedureCall:procedure': {
        return ProcedureTypeB;
    }
    // ...
}
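To illustrate the check I would expect from the validator, here is a minimal TypeScript sketch. It is not Langium code: CrossRefAssignment and findConflictingReferenceIds are hypothetical names, and the only thing taken from the generated code is the "Container:property" ID format seen in the switch above. It flags every reference ID that would need more than one target type.

```typescript
// Hypothetical model of one cross-reference assignment in a grammar rule.
interface CrossRefAssignment {
    container: string;   // rule name, e.g. 'ProcedureCall'
    property: string;    // assigned property, e.g. 'procedure'
    targetType: string;  // referenced type, e.g. 'ProcedureTypeA'
}

// Returns the reference IDs that would map to more than one target type.
// A switch keyed on such an ID can only ever return the first type.
function findConflictingReferenceIds(assignments: CrossRefAssignment[]): string[] {
    const typesById = new Map<string, Set<string>>();
    for (const a of assignments) {
        const id = `${a.container}:${a.property}`;
        const types = typesById.get(id) ?? new Set<string>();
        types.add(a.targetType);
        typesById.set(id, types);
    }
    return Array.from(typesById.entries())
        .filter(([, types]) => types.size > 1)
        .map(([id]) => id);
}

// Attempt 2: the same property is assigned references to two different types.
const attempt2: CrossRefAssignment[] = [
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureTypeA' },
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureTypeA' },
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureTypeB' },
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureTypeB' },
];

// Attempt 3: every alternative refers to the same union type.
const attempt3: CrossRefAssignment[] = [
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureCallTarget' },
    { container: 'ProcedureCall', property: 'procedure', targetType: 'ProcedureCallTarget' },
];

console.log(findConflictingReferenceIds(attempt2));
console.log(findConflictingReferenceIds(attempt3));
```

In this model attempt 2 reports a conflict for ProcedureCall:procedure, while attempt 3 reports none because all alternatives share one target type.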
Attempt 3:
type ProcedureCallTarget = ProcedureTypeA | ProcedureTypeB;
ProcedureCall:
    (procedure=[ProcedureCallTarget:SIMPLE_NAME] | procedure=[ProcedureCallTarget:FULLY_QUALIFIED_NAME]);
This is accepted by the validator and seems to work at runtime, too.
Attempt 4: Trying to write the above in a somewhat more compact form:
type ProcedureCallTarget = ProcedureTypeA | ProcedureTypeB;
ProcedureCall:
    (procedure=([ProcedureCallTarget:SIMPLE_NAME] | [ProcedureCallTarget:FULLY_QUALIFIED_NAME]));
This gets rejected by the validator: "Mixing a cross-reference with other types is not supported. Consider splitting property "procedure" into two or more different properties."
I don't quite understand why it gets rejected. Semantically, it seems to be the same as attempt 3. The construct that Langium doesn't seem to support is mixing reference types (e.g. ProcedureTypeA | ProcedureTypeB). But different terminal types (e.g. SIMPLE_NAME | FULLY_QUALIFIED_NAME) don't seem to pose a problem for the Langium runtime.
Langium version: 3.4.0
Package name: https://registry.npmjs.org/langium/-/langium-3.4.0.tgz
The current behavior: Validator accepts attempt 2; Validator rejects attempt 4.
The expected behavior: Validator should reject attempt 2; Validator should accept attempt 4.
Did you try this?
PCName: SIMPLE_NAME | FULLY_QUALIFIED_NAME;
ProcedureCall: (procedure=[ProcedureCallTarget:PCName]);
Hey @dgDSA,
this is (mostly) working as designed. See also the reference unions docs. The issue is that an assignment such as procedure=([ProcedureTypeA:SIMPLE_NAME] | [ProcedureTypeB:SIMPLE_NAME]) is ambiguous from a parser perspective: both alternatives consume the same token, so the parser cannot tell whether it is meant to create a reference to ProcedureTypeA or one to ProcedureTypeB. Your "Attempt 3" is exactly how it is meant to be done.
However, your attempt 4 also should work as expected - the validation is likely overly eager to report an error in that instance (likely because we never anticipated it to be used like that).
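The ambiguity can be shown with a small TypeScript model. This is a deliberately simplified sketch, not how the Langium parser is implemented: the two regular expressions are assumed terminal definitions (the real grammar's terminals are not shown in this thread), and CrossRefAlternative and candidateTargets are hypothetical names. It lists which target types remain possible for a given piece of reference text.

```typescript
type TerminalName = 'SIMPLE_NAME' | 'FULLY_QUALIFIED_NAME';

// Assumed terminal definitions; the terminals in the actual grammar may differ.
const terminals: Record<TerminalName, RegExp> = {
    SIMPLE_NAME: /^[A-Za-z_]\w*$/,
    FULLY_QUALIFIED_NAME: /^[A-Za-z_]\w*(\.[A-Za-z_]\w*)+$/,
};

// Hypothetical model of one cross-reference alternative, e.g. [ProcedureTypeA:SIMPLE_NAME].
interface CrossRefAlternative {
    targetType: string;
    terminal: TerminalName;
}

// Collects the target types of all alternatives whose terminal matches the text.
// More than one result means the text alone cannot decide the reference type.
function candidateTargets(text: string, alternatives: CrossRefAlternative[]): string[] {
    const targets = new Set<string>();
    for (const alt of alternatives) {
        if (terminals[alt.terminal].test(text)) {
            targets.add(alt.targetType);
        }
    }
    return Array.from(targets);
}

// Two alternatives that differ only in the target type.
const separateTypes: CrossRefAlternative[] = [
    { targetType: 'ProcedureTypeA', terminal: 'SIMPLE_NAME' },
    { targetType: 'ProcedureTypeB', terminal: 'SIMPLE_NAME' },
];

// Attempt 3 / attempt 4: one union type, two terminals.
const unionType: CrossRefAlternative[] = [
    { targetType: 'ProcedureCallTarget', terminal: 'SIMPLE_NAME' },
    { targetType: 'ProcedureCallTarget', terminal: 'FULLY_QUALIFIED_NAME' },
];

console.log(candidateTargets('doWork', separateTypes));
console.log(candidateTargets('doWork', unionType));
console.log(candidateTargets('pkg.doWork', unionType));
```

With separate target types, a simple name leaves two candidates. With the union type there is always exactly one candidate, regardless of which terminal matched, which is why attempts 3 and 4 are semantically unproblematic.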
@msujew Thanks for the link to the reference docs. Yes, your explanation of the ambiguity makes sense.
@cdietrich Thanks for the great suggestion, that makes it a lot more elegant.
As a reference for others, this is now the elegant grammar that works for us:
ReferenceByName returns string:
    SIMPLE_NAME | FULLY_QUALIFIED_NAME;
type ProcedureCallTarget = ProcedureTypeA | ProcedureTypeB;
ProcedureCall:
    (procedure=[ProcedureCallTarget:ReferenceByName]);
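For anyone consuming such a reference afterwards, here is a rough TypeScript sketch of how the union target might be narrowed. The interfaces below are simplified stand-ins written for this example, not the generated ast.ts: the name property on both procedure types and the describeTarget helper are assumptions, and the local Reference interface only keeps the two members used here.

```typescript
// Simplified stand-ins for generated AST types; the real ones carry more members.
interface ProcedureTypeA { $type: 'ProcedureTypeA'; name: string; }
interface ProcedureTypeB { $type: 'ProcedureTypeB'; name: string; }
type ProcedureCallTarget = ProcedureTypeA | ProcedureTypeB;

// Reduced cross-reference shape: `ref` stays undefined while unresolved.
interface Reference<T> { $refText: string; ref?: T; }

interface ProcedureCall {
    $type: 'ProcedureCall';
    procedure: Reference<ProcedureCallTarget>;
}

// Hypothetical helper: narrows the union via the $type discriminator.
function describeTarget(call: ProcedureCall): string {
    const target = call.procedure.ref;
    if (target === undefined) {
        return `unresolved reference '${call.procedure.$refText}'`;
    }
    if (target.$type === 'ProcedureTypeA') {
        return `type A procedure '${target.name}'`;
    }
    return `type B procedure '${target.name}'`;
}

const resolved: ProcedureCall = {
    $type: 'ProcedureCall',
    procedure: { $refText: 'pkg.doWork', ref: { $type: 'ProcedureTypeB', name: 'doWork' } },
};
const unresolved: ProcedureCall = {
    $type: 'ProcedureCall',
    procedure: { $refText: 'missing' },
};

console.log(describeTarget(resolved));
console.log(describeTarget(unresolved));
```

The point is only that a single property typed with the union keeps both target types reachable, so the caller decides per resolved node which one it got.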