bids-validator

Thoughts on formalizing resources needed for validation steps.

Open rwblair opened this issue 3 years ago • 0 comments

Currently the validator relies on a combination of regex and string.includes conditionals in JavaScript to determine which function should be run on which files, parts of files, or groups of files.

What I propose is one or more files that specify a set of conditions by datatype, entity, suffix, and extension, along with a set of validation rules to apply to files that meet those criteria.

The validation environment would be pre-seeded with a context object holding general information about the dataset. This context would also include the automatically compiled metadata associated with the file, following the inheritance principle, as well as relevant sibling files. The full validation environment compiled for each file might look like this in JSON:

{
    "json": { <actual contents of merged json sidecars> },
    "nii": { <non-json files could have serializers to extract validation-relevant info and transform it into json, in this case header info> },
    "tsv": { <event files for niftis, again serialized into json, maybe not all of the file as they can be quite large> },
    // global context:
    "dataset": {
        "files": <list of all files in dataset?>,
        <info derived from top level files or injected from some other source>
    }
}
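
As a rough illustration of how such an environment might be compiled, here is a minimal JavaScript sketch. The names `mergeSidecars` and `buildContext`, and the sample field values, are hypothetical and not part of the validator. It assumes the inheritance principle can be approximated by a shallow merge in which sidecars closer to the file override those nearer the dataset root.

```javascript
// Hypothetical helper: merge JSON sidecars ordered from the dataset root down
// to the file's own directory, so more specific sidecars override general ones.
// Assumption: a shallow, key-by-key override is enough for this sketch.
function mergeSidecars(sidecarsRootFirst) {
  return Object.assign({}, ...sidecarsRootFirst);
}

// Hypothetical helper: assemble the per-file validation environment in the
// shape proposed above (json / nii / tsv / dataset).
function buildContext({ sidecars, niftiHeader, events, datasetFiles }) {
  return {
    json: mergeSidecars(sidecars),
    nii: niftiHeader,
    tsv: events,
    dataset: { files: datasetFiles },
  };
}

// Illustrative input only; file names and values are made up.
const context = buildContext({
  sidecars: [
    { RepetitionTime: 2.0, TaskName: 'rest' }, // sidecar at the dataset root
    { RepetitionTime: 1.5 },                   // sidecar next to the bold file
  ],
  niftiHeader: { dim: [4, 64, 64, 32, 200] },
  events: { columns: ['onset', 'duration'] },
  datasetFiles: ['/task-rest_bold.json', '/sub-01/func/sub-01_task-rest_bold.nii'],
});
```

In this sketch the sidecar nearest the file wins for `RepetitionTime`, while `TaskName` is inherited from the root-level sidecar.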

We could then validate this environment with JSON schemas, as we do other JSON files. The condition that triggers the environment compilation, and the rules to be applied to it, might look something like this:

{   
    "condition": {
        "suffix": ["bold"],
        "extension": ["nii"]
    },
    "rules": {
        // list of schemas and which parts of environment they should apply to.
        "load": {
            // multiple schemas would be merged
            "json": ["json/schemas/bold.json", "more_specific_schema_to_our_condition.json"],
            "nii": ["serialized_nifti.json"],
            "root": ["<schema that dips into multiple parts of the validation environment>.json"]
        },
        // or we could pass the environment to a js function:
        "execute": [
            "non_schema_validatable.js"
        ]
    }
}
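
To make the matching step concrete, here is a minimal JavaScript sketch of how a condition like the one above might be checked against a file. The function names `matchesCondition` and `selectRules`, and the shape of the `file` object, are assumptions for illustration. The sketch treats each condition key as a list of allowed values and requires every key to match.

```javascript
// Hypothetical: a file is described by its parsed parts, e.g.
// { datatype: 'func', suffix: 'bold', extension: 'nii' }.
// A condition matches when, for every key it names, the file's value
// appears in that key's list of allowed values.
function matchesCondition(condition, file) {
  return Object.entries(condition).every(
    ([key, allowed]) => allowed.includes(file[key])
  );
}

// Return the rules objects of every rule set whose condition matches the file.
function selectRules(ruleSets, file) {
  return ruleSets
    .filter((ruleSet) => matchesCondition(ruleSet.condition, file))
    .map((ruleSet) => ruleSet.rules);
}

// Illustrative rule set mirroring the example above.
const ruleSets = [
  {
    condition: { suffix: ['bold'], extension: ['nii'] },
    rules: {
      load: { json: ['json/schemas/bold.json'], nii: ['serialized_nifti.json'] },
      execute: ['non_schema_validatable.js'],
    },
  },
];

const boldFile = { datatype: 'func', suffix: 'bold', extension: 'nii' };
const anatFile = { datatype: 'anat', suffix: 'T1w', extension: 'nii' };
const boldMatches = selectRules(ruleSets, boldFile);
const anatMatches = selectRules(ruleSets, anatFile);
```

Keys absent from a condition are simply not checked, so a condition naming only `suffix` would apply to every extension.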

We might imagine nesting further conditions and rule sets inside other rules objects to further refine what gets validated and how. Obviously, something this JSON-centric is at odds with the YAML work done so far. Maybe this just stays a validator thing.
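
The nesting idea could be sketched roughly as follows. This is only an illustration: the `nested` key, the function names, and the schema file names are all hypothetical, and nothing here reflects an agreed format. A nested rule set is only considered when its parent's condition already matched.

```javascript
// Same matching assumption as before: every key in the condition must list
// the file's value for that key.
function conditionHolds(condition, file) {
  return Object.entries(condition).every(
    ([key, allowed]) => allowed.includes(file[key])
  );
}

// Collect the JSON schemas that apply to a file, descending into nested rule
// sets only when the enclosing condition matched. "nested" is a made-up key.
function collectSchemas(ruleSet, file) {
  if (!conditionHolds(ruleSet.condition, file)) return [];
  const load = ruleSet.rules.load || {};
  const own = load.json || [];
  const children = ruleSet.rules.nested || [];
  const inner = children.flatMap((child) => collectSchemas(child, file));
  return [...own, ...inner];
}

const nestedRuleSet = {
  condition: { suffix: ['bold'] },
  rules: {
    load: { json: ['json/schemas/bold.json'] },
    nested: [
      {
        condition: { extension: ['nii'] },
        rules: { load: { json: ['more_specific_schema.json'] } },
      },
    ],
  },
};

const niiSchemas = collectSchemas(nestedRuleSet, { suffix: 'bold', extension: 'nii' });
const jsonSchemas = collectSchemas(nestedRuleSet, { suffix: 'bold', extension: 'json' });
```

Here the outer schema applies to every `bold` file, while the nested one is added only for the `nii` extension.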

rwblair, Dec 15 '21 21:12