schema_salad
schema-salad-tool throws an exception at document parsing time for schemas that contain a union of enum arrays
Although a schema with a union of enum arrays passes validation, schema-salad-tool cannot parse documents against such a schema. It throws an exception caused by the schema itself rather than by the input document.
If this is a bug in schema-salad-tool, I will send a pull request for it.
Inputs
- enum-arrays.yml
$graph:
  - name: Enum1
    type: enum
    symbols: [a]
  - name: Enum2
    type: enum
    symbols: [b]
  - name: Test
    documentRoot: true
    type: record
    fields:
      field:
        type:
          - type: array
            items: Enum1
          - type: array
            items: Enum2
It is a valid schema, as the following command shows:
$ schema-salad-tool schema_salad/schema_salad/metaschema/metaschema.yml enum-arrays.yml
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Document `enum-arrays.yml` is valid
Therefore we expect that schema-salad-tool can parse the following document with the above schema:
- doc.yml
field: [a]
We expect that doc.yml can be parsed with enum-arrays.yml:
$ schema-salad-tool enum-arrays.yml doc.yml
Expected behavior
The command succeeds as follows:
$ schema-salad-tool enum-arrays.yml doc.yml
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Document `doc.yml` is valid
Actual behavior
Validation fails with a schema error, even though the schema itself is valid:
$ schema-salad-tool enum-arrays.yml doc.yml
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Schema `enum-arrays.yml` error:
Union item must be a valid Avro schema: {'name': 'workspaces.cwltest.enum-arrays.yml.Test', 'documentRoot': True, 'type': 'record', 'fields': [{'type': [{'type': 'array', 'items': 'workspaces.cwltest.enum-arrays.yml.Enum1', 'name': ''}, {'type': 'array', 'items': 'workspaces.cwltest.enum-arrays.yml.Enum2', 'name': ''}], 'name': 'field'}]}
I found that the Avro specification does not permit a union to contain more than one array type.
Does this restriction also exist in Schema Salad? If so, the above behavior is intended by the spec, but it is too restrictive, IMO.
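As a rough illustration of the Avro rule referred to above (a union may not contain more than one schema of the same type, except for the named types record, enum, and fixed), here is a minimal Python sketch. It is not schema-salad's or Avro's actual implementation; the function names `union_branch_key` and `duplicate_union_branches` are hypothetical, and only string and mapping branches are handled.

```python
# Hypothetical sketch of the Avro union uniqueness rule; not schema-salad code.
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
NAMED_TYPES = {"record", "enum", "fixed"}


def union_branch_key(schema):
    """Return the key by which a union branch must be unique."""
    if isinstance(schema, str):
        # A bare string is either a primitive type or a reference to a named type.
        if schema in PRIMITIVES:
            return ("type", schema)
        return ("named", schema)
    kind = schema["type"]
    if kind in NAMED_TYPES:
        # Named types are distinguished by their name, so several may coexist.
        return ("named", schema.get("name"))
    # Unnamed complex types (array, map) are distinguished by kind only.
    return ("type", kind)


def duplicate_union_branches(union):
    """List the keys that occur more than once in the union."""
    seen = set()
    duplicates = []
    for branch in union:
        key = union_branch_key(branch)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# The union from the `field` type in enum-arrays.yml: two array branches.
field_union = [
    {"type": "array", "items": "Enum1"},
    {"type": "array", "items": "Enum2"},
]
print(duplicate_union_branches(field_union))         # [('type', 'array')]

# A union of two differently named enums does not collide.
print(duplicate_union_branches(["Enum1", "Enum2"]))  # []
```

Under this rule, the two array branches collide because arrays are unnamed, even though their item types differ.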
I want to improve cwltest-schema.yml so that `required` and process requirements cannot be specified simultaneously, as follows.
$base: "https://w3id.org/cwl/cwltest#"
$graph:
  - name: Required
    type: enum
    symbols: [required]
  - name: ProcessType
    type: enum
    symbols: [command_line_tool, workflow, expression_tool]
  - name: ProcessRequirement
    type: enum
    symbols: [docker, env, ...]
  - name: TestCase
    type: record
    documentRoot: true
    fields:
      ...
      tags:
        type:
          - type: array
            items: [Required, ProcessType]  # no process requirements must be specified if `required` is specified
          - type: array
            items: [ProcessType, ProcessRequirement]
      ...
Currently, this improvement is blocked by this issue.
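For reference, the semantics intended by the two array branches of `tags` can be sketched in plain Python: a tag list is acceptable if all of its entries fit within a single branch. This is only an illustration of the intent, not how schema-salad validates unions; `tags_match_union` is a hypothetical helper, and `PROCESS_REQUIREMENT` contains only the two symbols visible above because the full list is elided.

```python
def tags_match_union(tags, branches):
    """Return True if every tag belongs to the symbol set of one single branch."""
    return any(all(tag in symbols for tag in tags) for symbols in branches)


REQUIRED = {"required"}
PROCESS_TYPE = {"command_line_tool", "workflow", "expression_tool"}
# Only the symbols shown in the schema above; the remaining ones are elided there.
PROCESS_REQUIREMENT = {"docker", "env"}

# One symbol set per array branch of the `tags` union.
BRANCHES = [
    REQUIRED | PROCESS_TYPE,
    PROCESS_TYPE | PROCESS_REQUIREMENT,
]

print(tags_match_union(["required", "workflow"], BRANCHES))  # True
print(tags_match_union(["workflow", "docker"], BRANCHES))    # True
print(tags_match_union(["required", "docker"], BRANCHES))    # False
```

The last call is rejected because `required` and `docker` never appear together in the same branch, which is the constraint a single array with a flat union of item types cannot express.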