EEG: multiple dataset files can be submitted with different extensions

Open arnodelorme opened this issue 1 year ago • 7 comments

This dataset contains both .set and .vhdr files

https://nemar.org/dataexplorer/detail?dataset_id=ds003190

This should not be possible. There should be either one or the other.

arnodelorme avatar Feb 27 '24 23:02 arnodelorme

Relates to https://github.com/bids-standard/bids-specification/issues/1487

Unless I am mistaken, enforcing this would depend on updating the schema (see https://github.com/bids-standard/bids-specification/pull/1492) for the Deno-based validator.

I doubt the legacy validator will enforce this.

Remi-Gau avatar Feb 28 '24 07:02 Remi-Gau

I don't think we have a rule in BIDS that the data format chosen for a dataset must be consistent :thinking:

It is a bit weird to mix them in a single dataset, though.

sappelhoff avatar Feb 28 '24 09:02 sappelhoff

We do check whether someone has both .nii and .nii.gz files, as this is a somewhat common issue.

https://github.com/bids-standard/bids-validator/blob/932c782a556afea34346e994f655e03cf3e171fe/bids-validator/validators/nifti/duplicateFiles.js#L3-L34

A similar thing could be written for other formats, although it would be more complicated, because EEG has multi-file formats and files sharing the same stem isn't in itself an error.
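As a rough illustration of what such a check might look like for EEG, here is a minimal Python sketch. It is not the validator's code: the `EXTENSION_TO_FORMAT` table and the `find_conflicting_formats` helper are hypothetical names made up for this example. The idea is to map each extension to the format it belongs to, so that the several files of one multi-file format sharing a stem are fine, and only a stem stored in two different formats is reported.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical table (not taken from the validator or the schema).
# BrainVision and EEGLAB are multi-file formats, so several extensions
# legitimately share one stem and map to the same format name.
EXTENSION_TO_FORMAT = {
    ".vhdr": "BrainVision",
    ".vmrk": "BrainVision",
    ".eeg": "BrainVision",
    ".set": "EEGLAB",
    ".fdt": "EEGLAB",
    ".edf": "EDF",
    ".bdf": "BDF",
}


def find_conflicting_formats(paths):
    """Return {stem: sorted format names} for stems stored in more than one format."""
    formats_by_stem = defaultdict(set)
    for path in paths:
        p = PurePosixPath(path)
        fmt = EXTENSION_TO_FORMAT.get(p.suffix)
        if fmt is None:
            continue  # not a data file we know about (e.g. .json, .tsv)
        # Drop the extension so files of the same recording group together.
        formats_by_stem[str(p.with_suffix(""))].add(fmt)
    return {
        stem: sorted(fmts)
        for stem, fmts in formats_by_stem.items()
        if len(fmts) > 1
    }
```

Calling `find_conflicting_formats` on a listing that contains `sub-01_task-rest_eeg.vhdr`, `.vmrk`, `.eeg` and also `sub-01_task-rest_eeg.set` would report that stem with both `BrainVision` and `EEGLAB`; the BrainVision trio on its own would not be reported.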

effigies avatar Mar 06 '24 14:03 effigies

> We do check whether someone has both .nii and .nii.gz files, as this is a somewhat common issue.

But does the spec say that these shouldn't be mixed?

sappelhoff avatar Mar 07 '24 16:03 sappelhoff

No, but it does create ambiguity about the data, and it frequently causes problems for tools expecting to retrieve a unique data file for a collection of entities.

I would support making it an explicit part of the spec, though I would not complain if it happened after @Remi-Gau's schema changes were incorporated and supported by the schema validator.

effigies avatar Mar 08 '24 02:03 effigies

I think we had added this section in the spec, no?

https://bids-specification.readthedocs.io/en/stable/common-principles.html#uniqueness-of-data-files

But I need to get back to finishing the schema PR.

Remi-Gau avatar Mar 08 '24 05:03 Remi-Gau

> I think we had added this section in the spec, no?

What we added there was:

  • there MUST NOT be data_a.jpg and data_a.tif in the same dataset

However, the situation here is data_a.jpg and data_b.tif, i.e. different files in different formats 🤔
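To make the difference concrete: a check for this second situation would have to look across the whole dataset rather than per file stem. Below is a minimal Python sketch under stated assumptions. The `PRIMARY_EXTENSIONS` table and both function names are hypothetical, and whether such a mix should be flagged at all is exactly the open question here; the spec section linked above does not cover it.

```python
from pathlib import PurePosixPath

# Hypothetical table: one "entry point" extension per EEG format, so that
# the extra files of a multi-file format are not counted separately.
PRIMARY_EXTENSIONS = {
    ".vhdr": "BrainVision",
    ".set": "EEGLAB",
    ".edf": "EDF",
    ".bdf": "BDF",
}


def formats_in_dataset(paths):
    """Group data files by format name, looking at the whole dataset."""
    found = {}
    for path in paths:
        fmt = PRIMARY_EXTENSIONS.get(PurePosixPath(path).suffix)
        if fmt is not None:
            found.setdefault(fmt, []).append(path)
    return found


def mixes_formats(paths):
    """True if the dataset stores its recordings in more than one format."""
    return len(formats_in_dataset(paths)) > 1
```

With one subject stored as `.set` and another as `.vhdr`, `mixes_formats` returns `True` even though no two files share a stem, which is the case the uniqueness rule does not address.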

sappelhoff avatar Mar 08 '24 09:03 sappelhoff