bids-specification
[SCHEMA] Reorganize how datatypes and suffixes are represented in loaded objects
Closes None, but continues from discussion in this week's schema/validator meeting. I'm going to leave this as a draft, because I want to know if folks think this reorganization is worthwhile.
Changes proposed:
- Update schemacode.schema.load_schema() to represent both datatypes and suffixes in an accessible manner.
Currently
Suffixes are grouped by shared patterns of entities and extensions within each datatype.
from pprint import pprint
from schemacode import utils, schema
path = utils.get_schema_path()
schobj = schema.load_schema(path)
pprint(schobj["rules"]["datatypes"]["beh"])
[{'entities': {'acquisition': 'optional',
               'recording': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv.gz', '.json'],
  'suffixes': ['stim', 'physio']},
 {'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json'],
  'suffixes': ['events', 'beh']}]
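To illustrate why this layout is awkward to navigate: finding the rules for one suffix means scanning every group in the datatype. A minimal sketch, assuming the list shown above; `find_groups` is a hypothetical helper for illustration, not part of schemacode.

```python
# Current layout for "beh", copied from the pprint output above.
beh_groups = [
    {
        "entities": {"acquisition": "optional", "recording": "optional",
                     "run": "optional", "session": "optional",
                     "subject": "required", "task": "required"},
        "extensions": [".tsv.gz", ".json"],
        "suffixes": ["stim", "physio"],
    },
    {
        "entities": {"acquisition": "optional", "run": "optional",
                     "session": "optional", "subject": "required",
                     "task": "required"},
        "extensions": [".tsv", ".json"],
        "suffixes": ["events", "beh"],
    },
]


def find_groups(groups, suffix):
    """Return every rule group in a datatype that lists the given suffix.

    Hypothetical helper: a linear scan is needed because suffixes are
    stored as list values rather than as dictionary keys.
    """
    return [group for group in groups if suffix in group["suffixes"]]


physio_rules = find_groups(beh_groups, "physio")
print(physio_rules[0]["extensions"])  # ['.tsv.gz', '.json']
```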
Viewing datatypes
Suffixes now have individual entries within the ["rules"]["datatypes"][datatype] dictionary.
pprint(schobj["rules"]["datatypes"]["beh"])
{'beh': [{'entities': {'acquisition': 'optional',
                       'run': 'optional',
                       'session': 'optional',
                       'subject': 'required',
                       'task': 'required'},
          'extensions': ['.tsv', '.json']}],
 'events': [{'entities': {'acquisition': 'optional',
                          'run': 'optional',
                          'session': 'optional',
                          'subject': 'required',
                          'task': 'required'},
             'extensions': ['.tsv', '.json']}],
 'physio': [{'entities': {'acquisition': 'optional',
                          'recording': 'optional',
                          'run': 'optional',
                          'session': 'optional',
                          'subject': 'required',
                          'task': 'required'},
             'extensions': ['.tsv.gz', '.json']}],
 'stim': [{'entities': {'acquisition': 'optional',
                        'recording': 'optional',
                        'run': 'optional',
                        'session': 'optional',
                        'subject': 'required',
                        'task': 'required'},
           'extensions': ['.tsv.gz', '.json']}]}
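One way to get from the grouped list to this per-suffix dictionary is sketched below. This is an illustration under the assumption that each group carries `entities`, `extensions`, and `suffixes` keys as in the earlier output; `split_by_suffix` is a hypothetical name and not necessarily how this PR implements the change.

```python
from copy import deepcopy


def split_by_suffix(groups):
    """Turn a datatype's list of rule groups into a dict keyed by suffix.

    Hypothetical helper for illustration only.
    """
    by_suffix = {}
    for group in groups:
        # Everything except the "suffixes" key describes the pattern itself.
        pattern = {key: value for key, value in group.items() if key != "suffixes"}
        for suffix in group["suffixes"]:
            # Copy so that later edits to one suffix do not leak into another.
            by_suffix.setdefault(suffix, []).append(deepcopy(pattern))
    return by_suffix


# Second "beh" group from the current layout shown earlier.
groups = [
    {
        "entities": {"acquisition": "optional", "run": "optional",
                     "session": "optional", "subject": "required",
                     "task": "required"},
        "extensions": [".tsv", ".json"],
        "suffixes": ["events", "beh"],
    },
]

result = split_by_suffix(groups)
print(sorted(result))  # ['beh', 'events']
```

Each suffix maps to a list rather than a single pattern, so a suffix that appears in more than one group of the same datatype keeps all of its patterns.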
Viewing suffixes
Suffixes are now directly accessible in the ["rules"]["suffixes"] dictionary, with each valid pattern represented as an entry in a list.
pprint(schobj["rules"]["suffixes"]["events"])
[{'datatypes': ['beh'],
  'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json']},
 {'datatypes': ['eeg', 'ieeg', 'meg'],
  'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.json', '.tsv']},
 {'datatypes': ['func'],
  'entities': {'acquisition': 'optional',
               'ceagent': 'optional',
               'direction': 'optional',
               'reconstruction': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json']},
 {'datatypes': ['pet'],
  'entities': {'reconstruction': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required',
               'tracer': 'optional'},
  'extensions': ['.tsv', '.json']}]
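A suffix index of this shape could be built by walking every datatype and merging datatypes that share an identical pattern. The sketch below assumes the grouped per-datatype layout as input and treats two patterns as identical only when their entities and their extension lists (including order) are equal, which would explain why beh and eeg/ieeg/meg stay separate above. `index_by_suffix` is a hypothetical name, and the input is a trimmed, illustrative subset rather than the full schema.

```python
from copy import deepcopy


def index_by_suffix(datatypes):
    """Build a suffix -> list-of-patterns index from per-datatype rule groups.

    Hypothetical helper for illustration; not the schemacode implementation.
    """
    suffixes = {}
    for datatype, groups in datatypes.items():
        for group in groups:
            pattern = {"entities": group["entities"],
                       "extensions": group["extensions"]}
            for suffix in group["suffixes"]:
                entries = suffixes.setdefault(suffix, [])
                # Reuse an existing entry if the pattern matches exactly.
                match = next(
                    (entry for entry in entries
                     if entry["entities"] == pattern["entities"]
                     and entry["extensions"] == pattern["extensions"]),
                    None,
                )
                if match is None:
                    entries.append({"datatypes": [datatype], **deepcopy(pattern)})
                else:
                    match["datatypes"].append(datatype)
    return suffixes


# Trimmed, illustrative input: only the "events" rules for three datatypes.
events_entities = {"acquisition": "optional", "run": "optional",
                   "session": "optional", "subject": "required",
                   "task": "required"}
datatypes = {
    "beh": [{"entities": events_entities, "extensions": [".tsv", ".json"],
             "suffixes": ["events", "beh"]}],
    "eeg": [{"entities": events_entities, "extensions": [".json", ".tsv"],
             "suffixes": ["events"]}],
    "meg": [{"entities": events_entities, "extensions": [".json", ".tsv"],
             "suffixes": ["events"]}],
}

suffix_index = index_by_suffix(datatypes)
print([entry["datatypes"] for entry in suffix_index["events"]])
# [['beh'], ['eeg', 'meg']]
```

Comparing extension lists as sets instead would merge beh with eeg and meg here; whether that is desirable is a design choice this sketch does not settle.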
I was also working on this file following up on last week's addition of tabular_metadata.yaml — but I think you're trying to do something different. Not as a criticism, but for my understanding since we might end up having clashing edits: why would this be necessary?
I think that we do need to reorganize, or at least formalize, the loaded schema object within schemacode, but I wouldn't say that this particular proposal is necessary. My hope is that the proposed changes would make navigating the schema easier by making suffixes directly accessible and by splitting up the datatype rules so that they are organized more like how we would organize the YAML files if they weren't written for humans.