schema_salad icon indicating copy to clipboard operation
schema_salad copied to clipboard

Undocumented behavior in the case of unions of typeDSL-ed optional types

Open tom-tan opened this issue 3 years ago • 1 comments

Here is a type definition of SaladRecordField#jsonldPredicate.

https://github.com/common-workflow-language/schema_salad/blob/7662c7f16b1642bcc267ff8f9095c4c5220c329f/schema_salad/metaschema/metaschema.yml#L284-L287

According to the spec of SALAD, string? and JsonldPredicate? are expanded with the typeDSL rule. It says:

If the type ends with a question mark ?, the question mark is stripped off and the type is expanded to a union with null

The definition of union types are in the spec of Apache Avro. It says:

Unions, as mentioned above, are represented using JSON arrays.

By naively interpreting the descriptions, the above type definition is processed to the following:

 - name: jsonldPredicate 
   type: 
     - ["null", "string"]
     - ["null", "JsonldPredicate"]

However, it violates the definition of SaladRecordField#type.

schema-salad-tool processes the above example to the following but this behavior is not documented in the spec of SALAD and Avro.

 - name: jsonldPredicate 
   type: 
     - "null"
     - string
     - JsonldPredicate

If it is correct, it would be nice if there is a description of this behavior.


IMO, the current behavior causes extra complexity to the parsers (especially the case of combining with $import that is replaced in the array node) while it only provides a tiny syntax sugar :-(.

tom-tan avatar Nov 10 '22 16:11 tom-tan

The current behavior, as implemented, is to flatten nested arrays and remove duplicates.

https://github.com/common-workflow-language/schema_salad/blob/cd1640845caf4f51f6799188a5422d57e1b63d72/schema_salad/ref_resolver.py#L707-L727

This was implemented as a "do what I mean" behavior, optional types are turned into unions, if you nest a union within a union, the most convenient/intended behavior is to flatten it to a single list. Are you proposing that it should throw an error instead?

I can provide a specification of the flattening/deduplication behavior.

tetron avatar Dec 05 '22 21:12 tetron