schema_salad
schema_salad copied to clipboard
Undocumented behavior in the case of unions of typeDSL-ed optional types
Here is a type definition of SaladRecordField#jsonldPredicate.
https://github.com/common-workflow-language/schema_salad/blob/7662c7f16b1642bcc267ff8f9095c4c5220c329f/schema_salad/metaschema/metaschema.yml#L284-L287
According to the spec of SALAD, string? and JsonldPredicate? are expanded with the typeDSL rule.
It says:
If the type ends with a question mark
?, the question mark is stripped off and the type is expanded to a union withnull
The definition of union types are in the spec of Apache Avro. It says:
Unions, as mentioned above, are represented using JSON arrays.
By naively interpreting the descriptions, the above type definition is processed to the following:
- name: jsonldPredicate
type:
- ["null", "string"]
- ["null", "JsonldPredicate"]
However, it violates the definition of SaladRecordField#type.
schema-salad-tool processes the above example to the following but this behavior is not documented in the spec of SALAD and Avro.
- name: jsonldPredicate
type:
- "null"
- string
- JsonldPredicate
If it is correct, it would be nice if there is a description of this behavior.
IMO, the current behavior causes extra complexity to the parsers (especially the case of combining with $import that is replaced in the array node) while it only provides a tiny syntax sugar :-(.
The current behavior, as implemented, is to flatten nested arrays and remove duplicates.
https://github.com/common-workflow-language/schema_salad/blob/cd1640845caf4f51f6799188a5422d57e1b63d72/schema_salad/ref_resolver.py#L707-L727
This was implemented as a "do what I mean" behavior, optional types are turned into unions, if you nest a union within a union, the most convenient/intended behavior is to flatten it to a single list. Are you proposing that it should throw an error instead?
I can provide a specification of the flattening/deduplication behavior.