bids-specification
bids-specification copied to clipboard
make inheritance applicability explicit for `scan.json`
@robertoostenveld in https://github.com/bids-standard/bids-specification/issues/538#issuecomment-669749360 mentions that "using the inheritance principle there can be a scans.json at the top level". Current inheritance principle formulation seems to suggest that too, but documentation nowhere mentions scans.json
although it could be quite a common use case.
ATM I see only a single dataset with a (broken) scans.json
(git)smaug:/mnt/datasets/datalad/crawl/openneuro[master]git
$> ls -ls ds00*/*scans.json
4 -rw------- 1 yoh datalad 656 Oct 23 2020 ds003242/scans.json
and which is
$> cat ds003242/.bidsignore
scans.json
possibly likely because prior versions of bids-validator choked on it.
So I wonder if we should add scans.json
- as an explicit example to "Inheritance principle" section
- to https://github.com/bids-standard/bids-specification/blob/master/src/schema/top_level_files.yaml (attn @bids-standard/raw-electrophys-ieeg ?
So I wonder if we should add scans.json to https://github.com/bids-standard/bids-specification/blob/master/src/schema/top_level_files.yaml (attn raw-electrophys-ieeg)
The top level files schema file says:
This file describes files which may appear at the top level of a dataset. This does not include information about whether these files are required or optional. For that information, see
rules/top_level_files.yaml
.
It contains files like:
- README
- LICENSE
- CHANGES
- dataset_description
however, due to inheritance many other files could be placed there ... for example a task-rest_eeg.json
file. This is currently not part of the schema in this way. Not sure if that's a shortcoming or a good thing ... but if we were to add scans.json
there, we'd surely have to add all those other files as well for consistency - WDYT?
as an explicit example to "Inheritance principle" section
I feel like with the recent clarifications in https://github.com/bids-standard/bids-specification/pull/946 this may not be so crucial anymore. But if you have a good idea for an example, why not add it :+1:
I think the same applies to sessions.json
.
events.json
too....
I agree with both of you Robert and Kay - but what do you think: Should we add those to the top_level_files
schema ... or rather not, because the only time they may ever be placed there is when inheritance applies.
I can see both solutions. What's your opinion @tsalo?
I am not sure what putting it in the schema would look like or what the trade-offs are.
For example, a filename for one of these files can't have a sub
entity or ses
entity as part of its name, right?
I am reticent to directly encode the inheritance principle into the schema. At the moment, .json
is treated as an extension like any other, rather than as a special sidecar file. If we want to encode the inheritance principle, we basically need to have a separate entry for each suffix group just for .json
s, and then we need to make just about every required entity in those entries optional.
For example, a filename for one of these files can't have a
sub
entity orses
entity as part of its name, right?
It could have sub
and ses
, as long as it appears in sub-X/ses-Y/sub-X_ses-Y_scans.<tsv|json>
. I suppose it could have ses
, but not sub
, too.
I think the user story - linked to which the implementational changes are to be refined - is "as a user I want a dataset with inheritance of the scans.json/sessions.json
at the top level to pass the validator (and events.json, but I think that already works).
To refine this user story, I suggest to make a test case, see whether the validator breaks.
- if not, add it to the validator test suite for regression testing
- if it breaks, fix the validator (which might mean "extend the schema"), and then add to the test suite