
[SCHEMA] Reorganize how datatypes and suffixes are represented in loaded objects

Open tsalo opened this issue 3 years ago • 2 comments

Closes no issue, but continues the discussion from this week's schema/validator meeting. I'm leaving this as a draft because I want to know whether folks think this reorganization is worthwhile.

Changes proposed:

  • Update schemacode.schema.load_schema() so that both datatypes and suffixes can be looked up directly by name in the loaded object.

Currently

Within each datatype, suffixes are grouped by the pattern of entities and extensions they share, so several suffixes sit together in one list entry.

from pprint import pprint
from schemacode import utils, schema

path = utils.get_schema_path()
schobj = schema.load_schema(path)

pprint(schobj["rules"]["datatypes"]["beh"])
[{'entities': {'acquisition': 'optional',
               'recording': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv.gz', '.json'],
  'suffixes': ['stim', 'physio']},
 {'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json'],
  'suffixes': ['events', 'beh']}]

Viewing datatypes

Suffixes now have individual entries within the ["rules"]["datatypes"][datatype] dictionary.

pprint(schobj["rules"]["datatypes"]["beh"])
{'beh': [{'entities': {'acquisition': 'optional',
                       'run': 'optional',
                       'session': 'optional',
                       'subject': 'required',
                       'task': 'required'},
          'extensions': ['.tsv', '.json']}],
 'events': [{'entities': {'acquisition': 'optional',
                          'run': 'optional',
                          'session': 'optional',
                          'subject': 'required',
                          'task': 'required'},
             'extensions': ['.tsv', '.json']}],
 'physio': [{'entities': {'acquisition': 'optional',
                          'recording': 'optional',
                          'run': 'optional',
                          'session': 'optional',
                          'subject': 'required',
                          'task': 'required'},
             'extensions': ['.tsv.gz', '.json']}],
 'stim': [{'entities': {'acquisition': 'optional',
                        'recording': 'optional',
                        'run': 'optional',
                        'session': 'optional',
                        'subject': 'required',
                        'task': 'required'},
           'extensions': ['.tsv.gz', '.json']}]}
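
For anyone who wants to see the regrouping spelled out, here is a minimal sketch of how the grouped list could be turned into the per-suffix dictionary. It is only an illustration and not the actual change to load_schema(): `split_by_suffix` is a hypothetical helper name, and the input is the current "beh" output pasted in by hand so the snippet runs without schemacode installed.

```python
from copy import deepcopy


def split_by_suffix(groups):
    """Regroup a datatype's rule groups so each suffix has its own entry.

    Hypothetical helper for illustration; not part of schemacode.
    """
    by_suffix = {}
    for group in groups:
        # The pattern is everything in the group except the suffix list.
        pattern = {key: value for key, value in group.items() if key != "suffixes"}
        for suffix in group["suffixes"]:
            # Copy so that editing one suffix's rule cannot affect another's.
            by_suffix.setdefault(suffix, []).append(deepcopy(pattern))
    return by_suffix


# The current "beh" rules, copied by hand from the "Currently" output.
beh_groups = [
    {
        "entities": {
            "acquisition": "optional",
            "recording": "optional",
            "run": "optional",
            "session": "optional",
            "subject": "required",
            "task": "required",
        },
        "extensions": [".tsv.gz", ".json"],
        "suffixes": ["stim", "physio"],
    },
    {
        "entities": {
            "acquisition": "optional",
            "run": "optional",
            "session": "optional",
            "subject": "required",
            "task": "required",
        },
        "extensions": [".tsv", ".json"],
        "suffixes": ["events", "beh"],
    },
]

beh_by_suffix = split_by_suffix(beh_groups)
print(sorted(beh_by_suffix))  # ['beh', 'events', 'physio', 'stim']
```

Each suffix maps to a list rather than a single dictionary, so a suffix that appears in more than one group of the same datatype would keep all of its patterns.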

Viewing suffixes

Suffixes are now directly accessible in the ["rules"]["suffixes"] dictionary, with each valid pattern represented as an entry in a list.

pprint(schobj["rules"]["suffixes"]["events"])
[{'datatypes': ['beh'],
  'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json']},
 {'datatypes': ['eeg', 'ieeg', 'meg'],
  'entities': {'acquisition': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.json', '.tsv']},
 {'datatypes': ['func'],
  'entities': {'acquisition': 'optional',
               'ceagent': 'optional',
               'direction': 'optional',
               'reconstruction': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required'},
  'extensions': ['.tsv', '.json']},
 {'datatypes': ['pet'],
  'entities': {'reconstruction': 'optional',
               'run': 'optional',
               'session': 'optional',
               'subject': 'required',
               'task': 'required',
               'tracer': 'optional'},
  'extensions': ['.tsv', '.json']}]
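
A rough sketch of how a suffix-keyed index like this could be built from the per-datatype groups follows. It rests on some assumptions: `index_by_suffix` is a hypothetical name, the input is a trimmed toy version of the rules (the entities are shortened, and the "eeg"/"meg" groups are assumed rather than copied from the schema), and datatypes are merged into one entry only when their entities and extensions compare equal, which is why "beh" stays separate from "eeg" and "meg" when the extension order differs.

```python
def index_by_suffix(datatype_rules):
    """Build {suffix: [{"datatypes", "entities", "extensions"}, ...]}.

    Hypothetical helper for illustration; not part of schemacode.
    """
    suffixes = {}
    for datatype, groups in datatype_rules.items():
        for group in groups:
            for suffix in group["suffixes"]:
                entries = suffixes.setdefault(suffix, [])
                for entry in entries:
                    # Reuse an existing entry when the pattern matches exactly.
                    same_entities = entry["entities"] == group["entities"]
                    same_extensions = entry["extensions"] == group["extensions"]
                    if same_entities and same_extensions:
                        entry["datatypes"].append(datatype)
                        break
                else:
                    # No matching pattern yet, so start a new entry.
                    entries.append(
                        {
                            "datatypes": [datatype],
                            "entities": dict(group["entities"]),
                            "extensions": list(group["extensions"]),
                        }
                    )
    return suffixes


# Toy input: entities are trimmed, and the eeg/meg groups are assumptions.
entities = {"subject": "required", "session": "optional", "task": "required"}
toy_rules = {
    "beh": [
        {"entities": entities, "extensions": [".tsv", ".json"], "suffixes": ["events", "beh"]}
    ],
    "eeg": [
        {"entities": entities, "extensions": [".json", ".tsv"], "suffixes": ["events"]}
    ],
    "meg": [
        {"entities": entities, "extensions": [".json", ".tsv"], "suffixes": ["events"]}
    ],
}

suffix_index = index_by_suffix(toy_rules)
print([entry["datatypes"] for entry in suffix_index["events"]])
# [['beh'], ['eeg', 'meg']]
```

Comparing extensions as ordered lists is a design choice in this sketch; comparing them as sets would fold "beh" into the same entry as "eeg" and "meg".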

tsalo avatar Feb 17 '22 22:02 tsalo

I was also working on this file following up on last week's addition of tabular_metadata.yaml — but I think you're trying to do something different. Not as a criticism, but for my understanding since we might end up having clashing edits: why would this be necessary?

TheChymera avatar Feb 24 '22 09:02 TheChymera

I think that we do need to reorganize, or at least formalize, the loaded schema object within schemacode, but I wouldn't say that this particular proposal is necessary. My hope is that the proposed changes would make navigating the schema easier, by making suffixes directly accessible and by splitting up the datatype rules so that they are organized more like how we would organize the YAML files if they weren't written for humans.

tsalo avatar Feb 24 '22 19:02 tsalo