schema-salad-tool throws an exception at document parsing time for schemas with a union of enum arrays

Open: tom-tan opened this issue 2 years ago • 1 comment

Although schemas with a union of enum arrays pass validation, schema-salad-tool cannot parse documents against such schemas. It throws an exception that blames the schema rather than the input document.

If this is a bug in schema-salad-tool, I will send a pull request to fix it.

Inputs

  • enum-arrays.yml
$graph:
  - name: Enum1
    type: enum
    symbols: [a]
  - name: Enum2
    type: enum
    symbols: [b]
  - name: Test
    documentRoot: true
    type: record
    fields:
      field:
        type:
          - type: array
            items: Enum1
          - type: array
            items: Enum2

The schema itself is valid, as validating it against the metaschema shows:

$ schema-salad-tool schema_salad/schema_salad/metaschema/metaschema.yml enum-arrays.yml 
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Document `enum-arrays.yml` is valid

Therefore we expect that schema-salad-tool can parse the following document with the above schema:

  • doc.yml
field: [a]

The following command should parse doc.yml against enum-arrays.yml:

$ schema-salad-tool enum-arrays.yml doc.yml 
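
For convenience, the reproduction can be scripted. The following is only a sketch: it writes the two input files shown above into a temporary directory and, if `schema-salad-tool` is found on `PATH`, runs it with the same arguments as the command above. The helper name `write_inputs` is made up for this example, and the fully qualified names in any error message will depend on the directory used.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Contents of enum-arrays.yml, copied from the issue.
SCHEMA = """\
$graph:
  - name: Enum1
    type: enum
    symbols: [a]
  - name: Enum2
    type: enum
    symbols: [b]
  - name: Test
    documentRoot: true
    type: record
    fields:
      field:
        type:
          - type: array
            items: Enum1
          - type: array
            items: Enum2
"""

# Contents of doc.yml, copied from the issue.
DOC = "field: [a]\n"


def write_inputs(directory):
    """Write both input files into `directory` and return their paths."""
    schema_path = Path(directory) / "enum-arrays.yml"
    doc_path = Path(directory) / "doc.yml"
    schema_path.write_text(SCHEMA)
    doc_path.write_text(DOC)
    return schema_path, doc_path


workdir = tempfile.mkdtemp()
schema_path, doc_path = write_inputs(workdir)

tool = shutil.which("schema-salad-tool")
if tool is None:
    # The tool is not installed here; the inputs are still available.
    print("schema-salad-tool not found on PATH; inputs written to", workdir)
else:
    # Same invocation as in the issue: schema first, then the document.
    result = subprocess.run(
        [tool, str(schema_path), str(doc_path)],
        capture_output=True,
        text=True,
    )
    print("exit code:", result.returncode)
    print(result.stdout + result.stderr)
```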

Expected behavior

The command succeeds with the following output:

$ schema-salad-tool enum-arrays.yml doc.yml 
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Document `doc.yml` is valid

Actual behavior

Validation fails with an error that blames the schema, even though the schema is valid:

$ schema-salad-tool enum-arrays.yml doc.yml 
/home/vscode/.local/bin/schema-salad-tool Current version: 8.2.20220204150214
Schema `enum-arrays.yml` error:
Union item must be a valid Avro schema: {'name': 'workspaces.cwltest.enum-arrays.yml.Test', 'documentRoot': True, 'type': 'record', 'fields': [{'type': [{'type': 'array', 'items': 'workspaces.cwltest.enum-arrays.yml.Enum1', 'name': ''}, {'type': 'array', 'items': 'workspaces.cwltest.enum-arrays.yml.Enum2', 'name': ''}], 'name': 'field'}]}

tom-tan, May 14 '22 12:05

I found that Avro Schema does not permit a union that contains more than one array type.

Does this restriction also apply in Schema Salad? If so, the above behavior is intended by the spec, but it is too restrictive, IMO.
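
To make the Avro rule concrete: the Avro specification says a union may not contain more than one schema of the same type, except for the named types record, fixed and enum. The check below is a minimal illustration of that rule, not schema-salad's actual implementation. The function names are made up for this example, and it assumes each union branch is either a string or a dict whose `type` is a string.

```python
# Illustrative check of the Avro union rule; not schema-salad's actual code.
NAMED_TYPES = {"record", "enum", "fixed"}  # named types may repeat in a union


def branch_kind(branch):
    """Return the Avro type of a union branch given as a str or a dict."""
    if isinstance(branch, dict):
        return branch["type"]
    return branch  # a primitive type name or a reference to a named type


def duplicate_unnamed_kinds(union):
    """List the non-named types that occur more than once in the union."""
    seen = set()
    duplicates = []
    for branch in union:
        kind = branch_kind(branch)
        if kind in NAMED_TYPES:
            continue  # several records, enums or fixeds are allowed
        if kind in seen and kind not in duplicates:
            duplicates.append(kind)
        seen.add(kind)
    return duplicates


# The type of `field` from enum-arrays.yml: two array branches.
field_type = [
    {"type": "array", "items": "Enum1"},
    {"type": "array", "items": "Enum2"},
]
print(duplicate_unnamed_kinds(field_type))  # ['array']
```

Under this rule, the two `array` branches clash regardless of their `items`, which matches the schema error reported above.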

I want to improve cwltest-schema.yml so that `required` and process requirements cannot be specified simultaneously, as follows:

$base: "https://w3id.org/cwl/cwltest#"
$graph:
  - name: Required
    type: enum
    symbols: [required]
  - name: ProcessType
    type: enum
    symbols: [command_line_tool, workflow, expression_tool]
  - name: ProcessRequirement
    type: enum
    symbols: [docker, env, ...]
  - name: TestCase
    type: record
    documentRoot: true
    fields:
      ...
      tags:
        type:
          - type: array
            items: [Required, ProcessType] # no process requirements must be specified if `required` is specified
          - type: array
            items: [ProcessType, ProcessRequirement]
      ...

Currently this improvement is blocked by this issue.
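
The constraint I want the `tags` union to express can be sketched in plain Python. This only illustrates the intended semantics and is not code from cwltest. The helper name `tags_are_valid` is made up, and the process requirement set contains only the symbols written out in the schema above, since the rest of that list is elided.

```python
# Hypothetical helper illustrating the intended constraint on `tags`.
REQUIRED = {"required"}
PROCESS_TYPES = {"command_line_tool", "workflow", "expression_tool"}
# Only the symbols spelled out in the schema; the remainder is elided there.
PROCESS_REQUIREMENTS = {"docker", "env"}


def tags_are_valid(tags):
    """Accept tags that match either branch of the proposed union."""
    # First branch: array of Required or ProcessType.
    first_form = all(t in REQUIRED | PROCESS_TYPES for t in tags)
    # Second branch: array of ProcessType or ProcessRequirement.
    second_form = all(t in PROCESS_TYPES | PROCESS_REQUIREMENTS for t in tags)
    return first_form or second_form


print(tags_are_valid(["required", "workflow"]))  # True
print(tags_are_valid(["workflow", "docker"]))    # True
print(tags_are_valid(["required", "docker"]))    # False
```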

tom-tan, May 20 '22 10:05