cwl-utils
                                
                                 cwl-utils copied to clipboard
                                
                                    cwl-utils copied to clipboard
                            
                            
                            
                        schema_salad.exceptions.ValidationExceptions parsing certain data structures
import cwl_utils.parser_v1_2 as parser
parser.load_document("https://raw.githubusercontent.com/common-workflow-library/bio-cwl-tools/release/picard/picard_CreateSequenceDictionary.cwl")
  InitialWorkDirRequirement:
    listing:
      - $(inputs.REFERENCE)
outputs:
  ...
  sequences_with_dictionary:
    type: File
    format: edam:format_2573  # SAM
    secondaryFiles:
      - .dict
      - .fai
parser.load_document("https://raw.githubusercontent.com/common-workflow-library/bio-cwl-tools/release/bamtools/bamtools_stats.cwl")
bamtools_stats.cwl:110:5:  the `outputBinding` field is not valid...
the `outputEval` field is not valid...
secondaryFiles within outputs is not terribly common but listing should accept that expression
https://github.com/common-workflow-language/schema_salad/blob/main/schema_salad/tests/test_schema/CommandLineTool.yml#L776-L781
Probably not related, but you are testing with a v1.2 parser even though document declares cwlVersion: 1.2
@leipzig so what's actually happening here is that codegen was never updated for the schema salad v1.1, which added a few features that are used for CWL 1.1 and 1.2. Specifically:
- secondaryFilesDSL
- Special treatment of the Expressiontype
- subscopes (used for assigning identifiers)
- default values
The first two are causing the problems you're reporting. Some suggestions:
- You need to add a case for secondaryFilesDSL, see https://github.com/common-workflow-language/schema_salad/blob/main/schema_salad/codegen.py#L113 and https://www.commonwl.org/v1.1/SchemaSalad.html#Domain_Specific_Language_for_secondary_files
- You need to add a special case for the Expressiontype https://github.com/common-workflow-language/schema_salad/blob/af9842edc3e55723bfd0c50562ab44c7daa35226/schema_salad/python_codegen.py#L285 and add an_ExpressionLoaderwhich should pretty much just check if the value is a string
- declare_fieldhttps://github.com/common-workflow-language/schema_salad/blob/af9842edc3e55723bfd0c50562ab44c7daa35226/schema_salad/python_codegen.py#L368 needs an extra- subscopeparameter, if the subscope is non-empty, then it gets appended to- baseuriwhen loading the field
- begin_classshould be extended to so that fields with- defaultor- optionalare sorted to the end of the- __init__parameters and assigned None, and then in the body of- __init__any fields with- defaultthat are None should be assigned the default value.
What you should do is make a PR against schema-salad that fixes these issues, and a PR against cwl-utils with updated v1.1 and v1.2 parsers and test cases of previously-failing v1.1 and v1.2 files.
@tetron I am a bit confused about number 3. Does it apply only to declare_id_field or declare_field as well. It seems the subscope argument is something that appears as when jsonldPredicate is a list, so it would be more relevant to declare_field
      jsonldPredicate:
        _id: "cwl:run"
        _type: "@id"
        subscope: run
but that when it's jsonldPredicate: "@id" we are talking about workflow id
@leipzig When the predicate is @id that means that field is the identifier for the object it appears in.
A field with _type: "@id" means it is a reference to an identifier.
A subscope means that jsonldPredicate: "@id" fields get an extra path element to avoid identifier name conflicts with the parent.  This is done by adding the subscope to the baseuri that is used to construct identifiers in the nested objects.
Does that help?
Should I just concatenate baseuri+subscope?
Pretty much.  Here's what ref_resolver.py does
subscope = ""  # type: str
if key in loader.subscopes:
    subscope = "/" + loader.subscopes[key]
document[key], _ = loader.resolve_all(
    val, base_url + subscope, file_base=file_base, checklinks=False
)
(the important part is that it also adds a slash)
@leipzig Are you still working on this? Is there a branch of what you've tried already?
@mr-c i have some progress on this branch https://github.com/leipzig/schema_salad/tree/main can you point me to a 1.2 example on hand with all of these features?
@leipzig Huzzah!
Here are some CWL v1.2 examples. courtesy @jdidion
https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/record-sd-secondaryFiles.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/anon_enum_inside_array_inside_schemadef.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/stage_file_array_basename_and_entryname.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/docker-array-secondaryfiles.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/optional-output.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/record-in-secondaryFiles.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/initialworkdirrequirement-docker-out.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/secondaryfiles/rename-outputs.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/stage_file_array.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/stage_file_array_basename.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/stage_array.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/initialworkdir-glob-fullpath.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/record-out-secondaryFiles.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/secondaryfiles/rename-inputs.yml https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/params.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/params2.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/schemadef-tool.cwl https://github.com/common-workflow-language/cwl-v1.2/blob/main/tests/imported-hint.cwl