Implement JS readers for data formats: EDF, BDF, BrainVision, ...

Open sappelhoff opened this issue 4 years ago • 5 comments

following quote by @arnodelorme:

This BIDS dataset contains both an .edf and a .bdf file (both very small): https://openneuro.org/datasets/ds002034/versions/1.0.1

sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.edf
sub-01/ses-01/eeg/sub-01_ses-01_task-offline_run-01_eeg.bdf

I believe it should not have passed the validator since there are 2 types of binary files and the BDF file is obviously corrupted.

Originally posted by @sappelhoff in https://github.com/bids-standard/bids-validator/issues/1107#issuecomment-722223434


I haven't checked whether the BDF file is corrupted, but if it truly is, that raises another, already known, concern: We are not validating the contents of binary EEG files.

This problem is hard to solve because we would need to implement data format readers in JavaScript so that the bids-validator can open the files and check their validity. Currently, this is done for NIfTI files, and only for NIfTI files.
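For EDF and BDF, a basic check of this kind could look like the sketch below. It is only an illustration, not code from the bids-validator: `checkEdfBdfHeader` is a hypothetical name, and the function assumes it receives the first 256 bytes of the file as a `Uint8Array`. It relies on the fixed 256-byte header that both formats share (a version field at the start, the header size as ASCII at byte 184, and the number of signals as ASCII at byte 252).

```javascript
// Hypothetical helper, not part of bids-validator: sanity-check the fixed
// 256-byte header that EDF and BDF files share.
function checkEdfBdfHeader(bytes) {
  if (bytes.length < 256) {
    return { format: null, issues: ['shorter than the 256-byte fixed header'] }
  }
  const issues = []
  // Header fields are fixed-width ASCII strings padded with spaces.
  const field = (start, length) =>
    String.fromCharCode(...bytes.subarray(start, start + length)).trim()

  let format = null
  if (field(0, 8) === '0') {
    format = 'EDF' // version field is "0" followed by seven spaces
  } else if (bytes[0] === 0xff && field(1, 7) === 'BIOSEMI') {
    format = 'BDF' // byte 255 followed by "BIOSEMI"
  } else {
    issues.push('unrecognized version field')
  }

  // The full header is 256 bytes plus 256 bytes per signal, so the two
  // fields below must agree with each other.
  const headerBytes = Number(field(184, 8))
  const numSignals = Number(field(252, 4))
  if (!Number.isInteger(numSignals) || numSignals < 1) {
    issues.push('invalid number of signals')
  } else if (headerBytes !== 256 * (numSignals + 1)) {
    issues.push('header size does not match number of signals')
  }
  return { format, issues }
}
```

A check like this would flag a truncated or overwritten header, but it says nothing about whether the data records that follow are intact.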

Many months ago I tried to implement a reader/validator for the BrainVision format in JavaScript: https://github.com/sappelhoff/brainvision-validator/ (see also #475).

However, I ran into problems integrating it with the bids-validator, which runs both in the browser and on the CLI: the file access API in the browser is significantly different from, and more complicated than, accessing files from the CLI (or from programs written in MATLAB or Python).
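One way to hide that difference from a format reader is a small adapter that returns a byte range regardless of where the file lives. The sketch below is hypothetical and does not reflect how the bids-validator handles files internally: `readBytes` is an invented name, strings are assumed to be local paths, and anything else is assumed to be a browser `File` or `Blob`.

```javascript
// Hypothetical adapter, not bids-validator's actual API: read the byte range
// [start, end) from a browser File/Blob or from a path on the local disk.
async function readBytes(source, start, end) {
  if (typeof source === 'string') {
    // CLI: treat strings as file paths. The fs module is loaded lazily so
    // that the browser branch never touches a Node-only module.
    const { open } = await import('node:fs/promises')
    const handle = await open(source, 'r')
    try {
      const buffer = new Uint8Array(end - start)
      const { bytesRead } = await handle.read(buffer, 0, buffer.length, start)
      return buffer.subarray(0, bytesRead)
    } finally {
      await handle.close()
    }
  }
  // Browser: File and Blob expose slice(); the bytes are only read when
  // arrayBuffer() is awaited.
  return new Uint8Array(await source.slice(start, end).arrayBuffer())
}
```

With an adapter like this, a format check only needs to be written once against a `Uint8Array`, and both environments can feed it the same way.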

I will open this post as a separate issue, and we should certainly address it as soon as we have some resources available. (By resources, I mean people who have expertise, energy, and time.)

I discussed this many times with @jasmainak as well, maybe he remembers other issues or conversations to link to.

sappelhoff avatar Nov 05 '20 08:11 sappelhoff

When doing things like this, we need to be careful to write things in such a way as to only peek at the portions needed. If we have to ingest entire files, then this is going to make it very hard to do validation on remote datasets, such as datalad datasets with most large files left on S3.
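For files served over HTTP, peeking at only the needed portion can be done with a `Range` request, which S3 supports. The following is a minimal sketch, assuming the file is reachable at a plain URL and that a global `fetch` is available; `rangeHeader` and `fetchLeadingBytes` are hypothetical names.

```javascript
// Hypothetical helpers: download only the first `length` bytes of a remote file.
function rangeHeader(start, length) {
  // HTTP byte ranges are inclusive on both ends.
  return `bytes=${start}-${start + length - 1}`
}

async function fetchLeadingBytes(url, length) {
  const response = await fetch(url, {
    headers: { Range: rangeHeader(0, length) },
  })
  // 206 Partial Content means the server honored the range; a plain 200
  // means it ignored the header and is sending the whole file.
  if (response.status !== 206) {
    throw new Error(`range request not honored (status ${response.status})`)
  }
  return new Uint8Array(await response.arrayBuffer())
}
```

Refusing anything other than a 206 response keeps the validator from silently downloading a multi-gigabyte file when a server ignores the range.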

effigies avatar Nov 05 '20 13:11 effigies

When doing things like this, we need to be careful to write things in such a way as to only peek at the portions needed.

With NIfTI, is this solved by only looking at the headers?

This may not be so bad, because implementing basic "sanity" checks for data formats is a lot easier than writing full-fledged readers.

sappelhoff avatar Nov 05 '20 14:11 sappelhoff

Just the headers, yes. NIfTI has a fixed header, which makes it pretty straightforward.
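To illustrate why a fixed header helps, here is a minimal sketch of a NIfTI-1 check. It is not the bids-validator's actual implementation: `checkNifti1Header` is a hypothetical name, and the input is assumed to be the first 348 bytes of an uncompressed file (a .nii.gz would need to be decompressed first). It only uses two landmarks of the NIfTI-1 header: `sizeof_hdr` at byte 0 and the magic string at byte 344.

```javascript
// Hypothetical helper: check the two fixed landmarks of an uncompressed
// NIfTI-1 header.
function checkNifti1Header(bytes) {
  if (bytes.length < 348) return { ok: false, reason: 'shorter than 348 bytes' }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength)
  // sizeof_hdr is an int32 at offset 0 and must equal 348; the byte order
  // that yields 348 also reveals the endianness of the file.
  let littleEndian
  if (view.getInt32(0, true) === 348) littleEndian = true
  else if (view.getInt32(0, false) === 348) littleEndian = false
  else return { ok: false, reason: 'sizeof_hdr is not 348' }
  // The magic string sits at offset 344: "n+1\0" for a single .nii file,
  // "ni1\0" for an .hdr/.img pair.
  const magic = String.fromCharCode(...bytes.subarray(344, 347))
  if ((magic !== 'n+1' && magic !== 'ni1') || bytes[347] !== 0) {
    return { ok: false, reason: 'bad magic string' }
  }
  return { ok: true, littleEndian }
}
```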

effigies avatar Nov 05 '20 14:11 effigies

I have code for ingesting EDF and fif in javascript. I am sure I can dig out the relevant links if you're interested in following this up @sappelhoff ... I don't have a lot of bandwidth but certainly interested in a "javascript nibabel for ephys"

jasmainak avatar Nov 09 '20 06:11 jasmainak

I don't have a lot of bandwidth but certainly interested in a "javascript nibabel for ephys"

That makes two of us :heart_eyes: If that code is online, it would help to have it linked here for future reference.

I think one of the biggest barriers right now is a lack of instructions on "how to write a bids-validator plugin JS library", specifically a plugin library that can access files (e.g., for their headers). That comes back to https://github.com/bids-standard/bids-validator/issues/932 but also goes beyond it.

sappelhoff avatar Nov 09 '20 08:11 sappelhoff