Validate input-files with Pydantic and add CLI for validation

Open OscartGiles opened this issue 4 years ago • 0 comments

Summary (WIP)

Warning ⚠️ ☢️ Because I have only implemented this for CTGAN, merging to develop will break the whole pipeline, so this should only be merged once (and if) every input type has a schema implemented. Complete validation for CTGAN also requires additional info.

I was going to point this at https://github.com/alan-turing-institute/QUIPP-pipeline/pull/45 but as it's behind develop it's impossible to see diffs.

  • Adds a Python module, schema, for defining schemas for input JSON files using Pydantic. Pydantic can also generate JSON Schema from these definitions.
  • Validates JSON input files in synthesize.py using the module.
  • Adds a simple CLI to validate input files (see reviewer notes).
  • The CLI can template the form created by @ots22 with the JSON Schema generated by Pydantic.
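
For readers who have not used Pydantic, here is a minimal sketch of the pattern described above: a model class declares field names and types, constructing it validates a parsed JSON input, and the same class can emit JSON Schema. The class names, field names, and helper functions are hypothetical and chosen for illustration only; they are not the actual contents of the schema module.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class CTGANParameters(BaseModel):
    # Hypothetical parameter fields, for illustration only.
    num_samples: int
    num_epochs: int
    random_state: int


class CTGANInput(BaseModel):
    # Hypothetical top-level fields of an input JSON file.
    dataset: str
    synth_method: str
    parameters: CTGANParameters


def validate_input(data: dict) -> List[str]:
    """Return human-readable validation errors (an empty list if valid)."""
    try:
        CTGANInput(**data)
    except ValidationError as exc:
        # Each error carries the path to the offending field and a message.
        return [
            ".".join(str(part) for part in err["loc"]) + ": " + err["msg"]
            for err in exc.errors()
        ]
    return []


def json_schema() -> dict:
    """Return the JSON Schema for the model as a dict."""
    # Pydantic v2 exposes model_json_schema(); v1 exposes schema().
    if hasattr(CTGANInput, "model_json_schema"):
        return CTGANInput.model_json_schema()
    return CTGANInput.schema()
```

With only names and types declared, validation catches missing fields and wrong types but nothing about allowed ranges or values; adding that would mean further constraints on each field.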

Dependencies

I added these to env-configurations, but maybe this is not the best place. If we like the CLI, maybe it should be a package.

Reviewer notes

Test out the CLI:

Docs

python quipp_cli.py --help      

Validate an input file (mess with the file to see errors)

python quipp_cli.py validate run-inputs/ctgan-example-0.json  

Validation is only implemented for CTGAN at the moment, so other methods give an error:

python quipp_cli.py validate run-inputs/sgf-example-0.json          

Get JSON Schema for a given method

python quipp_cli.py schema CTGAN           

Create an input file in a browser (using the form from @ots22)

python quipp_cli.py create CTGAN
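
As a rough illustration of how subcommands like validate and schema can be wired together, here is a minimal sketch using argparse from the standard library. It is not the code in quipp_cli.py, which may use a different CLI library. The REQUIRED_KEYS registry, the "synth-method" key, and the function names are assumptions made for the example, and a plain required-key check stands in for the Pydantic models.

```python
import argparse
import json
import sys

# Hypothetical registry: method name -> required top-level keys.
# A stand-in for real schema classes; only CTGAN is filled in.
REQUIRED_KEYS = {"CTGAN": ["dataset", "synth-method", "parameters"]}


def validate(path):
    """Check one input file; return 0 if it passes, 1 otherwise."""
    with open(path) as f:
        data = json.load(f)
    method = data.get("synth-method")
    if method not in REQUIRED_KEYS:
        # Mirrors the "only implemented for CTGAN" case.
        print(f"No schema implemented for method {method!r}", file=sys.stderr)
        return 1
    missing = [key for key in REQUIRED_KEYS[method] if key not in data]
    for key in missing:
        print(f"Missing required field: {key}", file=sys.stderr)
    return 1 if missing else 0


def build_parser():
    parser = argparse.ArgumentParser(prog="quipp_cli.py")
    sub = parser.add_subparsers(dest="command", required=True)
    p_validate = sub.add_parser("validate", help="validate an input file")
    p_validate.add_argument("input_file")
    p_schema = sub.add_parser("schema", help="print the schema for a method")
    p_schema.add_argument("method", choices=sorted(REQUIRED_KEYS))
    return parser


def main(argv=None):
    """Dispatch to a subcommand and return a process exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "validate":
        return validate(args.input_file)
    # schema: print the (toy) schema for the chosen method.
    print(json.dumps({"required": REQUIRED_KEYS[args.method]}, indent=2))
    return 0
```

Keeping the method registry in one place means each newly supported method only needs a new entry, and both subcommands pick it up.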

The command names could probably be better. Because I haven't added any validation info beyond field names and types, the web form is quite minimal.

I thought one form per method type might be best. Thoughts, @ots22?

OscartGiles, Jan 15 '21 14:01