mne-bids

Support reading arbitrary _events.tsv columns

Open behinger opened this issue 3 years ago • 11 comments

Describe the problem

When loading a BIDS dataset, only the columns onset, duration, key and value are currently extracted from _events.tsv files and saved to raw.annotations.

Key + value are combined to key/value if I understand correctly.

Other columns are ignored. But those other columns are explicitly allowed in the BIDS specification:

An arbitrary number of additional columns can be added.

Describe your solution

The solution would be to load/save all columns as a pandas DataFrame (similar, afaik, to MNE metadata) instead of raw.annotations. I do not know whether raw.annotations allows for other arbitrary columns, but I remember that events/annotations/metadata is somewhat of an ongoing discussion. This is what I am currently doing: loading and overwriting the BIDS events file manually.

My use-case is EEG/Eye-Tracking data, which typically has tens of additional columns (e.g. for saccade-events: saccade amplitude, orientation, velocity, size etc.).

This also applies to writing events.tsv, but I haven't looked into it again for this issue.

behinger • Jul 16 '21 09:07

Hello @behinger,

I'm not entirely sure about the intended use case here. Are you saying you'd like to have a way to essentially retrieve everything stored in _events.tsv? If so, we could add a method to BIDSPath that returns a pd.DataFrame of the contents of .tsv sidecars.

Metadata in MNE is only available for Epochs, and I'm not convinced that adding "tens of columns" worth of data as Annotations to Raw data is a good idea either.

WDYT?

hoechenberger • Jul 16 '21 09:07

I intend to use the data with the rERP approach with deconvolution, where it is very common to have several different events, each with multiple predictors, importantly on the continuous EEG.

For simple 2*2 designs, annotations might be enough, but for regressions etc it seems quite unwieldy, so I agree.

Your solution seems good to me, and indeed I want to have everything in the events.tsv.

But maybe there is a larger issue: if you read and write data using mne-bids, all these event columns would be missing (without intervention) because MNE cannot represent all BIDS information.

behinger • Jul 16 '21 10:07

What do you suggest?

I would read the events.tsv file with pandas and do my rERP from this. You don’t need mne.Epochs so metadata is not relevant.
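For reference, a minimal sketch of this pandas route. The file content and the extra column names (saccade_amplitude, response_time) are made up for illustration; only the tab separator and the "n/a" missing-value marker come from BIDS.

```python
import io

import pandas as pd

# Hypothetical events.tsv content; columns beyond onset/duration are
# invented for this example.
EVENTS_TSV = (
    "onset\tduration\ttrial_type\tsaccade_amplitude\tresponse_time\n"
    "0.5\t0.0\tstimulus\tn/a\t0.412\n"
    "1.2\t0.05\tsaccade\t3.7\tn/a\n"
    "2.0\t0.0\tstimulus\tn/a\t0.388\n"
)

# BIDS marks missing values as "n/a"; map them to NaN on read.
events = pd.read_csv(io.StringIO(EVENTS_TSV), sep="\t", na_values="n/a")

# Every column survives, so the extra ones can serve as predictors.
saccades = events[events["trial_type"] == "saccade"]
print(events.columns.tolist())
print(saccades[["onset", "saccade_amplitude"]])
```

With a real dataset, the io.StringIO object would simply be replaced by the path to the _events.tsv file.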

agramfort • Jul 16 '21 12:07

I don't know because I don't know enough about mne.Annotations - if it is a fundamental limit of MNE-annotations to not support any other columns, maybe there is nothing to be done here. And certainly, I don't want to open the can-of-worms of events/annotations/metadata in this issue. Is mne.Annotations extensible to other fields? That would be a natural solution.

Many "raw" (straight from recorder) BIDS datasets have only Trigger-Codes, so for those, the current situation is fine. But I have seen BIDS datasets that make use of the custom event fields to identify "normal" conditions for epoching as well. It is therefore not only a problem for rERP, but for any dataset that uses optional BIDS fields and where users want to epoch according to these fields.

behinger • Jul 16 '21 13:07

MNE Annotations are essentially a bunch of lists containing onset, duration, and label or description of all annotated segments. Theoretically, you / we could generate annotations for all columns in an events.tsv file. We now also have Annotations.to_data_frame(), allowing you to easily retrieve all events in a tabular form. So yeah this seems feasible to me.
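One possible reading of "annotations for all columns", sketched with the standard library only: every non-missing cell becomes its own annotation whose description is "column/value". The file content and the column/value naming scheme are assumptions for illustration, not what mne-bids does.

```python
import csv
import io

# Hypothetical events.tsv content with one extra column (stim_id).
EVENTS_TSV = (
    "onset\tduration\ttrial_type\tstim_id\n"
    "0.5\t0.0\tstimulus\tA3\n"
    "1.2\t0.05\tsaccade\tn/a\n"
)

# Three parallel lists, mirroring how Annotations store their content.
onset, duration, description = [], [], []
for row in csv.DictReader(io.StringIO(EVENTS_TSV), delimiter="\t"):
    for column, value in row.items():
        # Skip the timing columns themselves and BIDS missing values.
        if column in ("onset", "duration") or value == "n/a":
            continue
        onset.append(float(row["onset"]))
        duration.append(float(row["duration"]))
        description.append(f"{column}/{value}")

# With MNE installed, these lists could be passed on as
# mne.Annotations(onset=onset, duration=duration, description=description).
print(description)
```

Numeric columns with many distinct values would produce many distinct descriptions this way, which is the unwieldiness raised earlier in the thread.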

hoechenberger • Jul 16 '21 13:07

I would need to get access to a dataset that has this to feel the problem. You can always read in the events with pandas.read_csv and then pass this to Epochs metadata on construction. It's a few more lines but I don't see a blocker here.
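Roughly, those few extra lines could look like the sketch below. The sampling rate, the file content and the column names are assumptions; the Epochs call is only shown as a comment because it needs a loaded Raw object.

```python
import io

import numpy as np
import pandas as pd

SFREQ = 500.0  # assumed sampling rate in Hz

# Hypothetical events.tsv content with two extra columns.
EVENTS_TSV = (
    "onset\tduration\ttrial_type\tresponse_time\tstim_id\n"
    "0.5\t0.0\tface\t0.412\tA3\n"
    "1.5\t0.0\thouse\t0.388\tB1\n"
    "2.5\t0.0\tface\t0.450\tA7\n"
)
df = pd.read_csv(io.StringIO(EVENTS_TSV), sep="\t", na_values="n/a")

# Map each trial_type to an integer code, as MNE event arrays expect.
names = sorted(df["trial_type"].unique())
event_id = {name: code for code, name in enumerate(names, start=1)}

# MNE-style events array of shape (n_events, 3): sample, 0, event code.
# This sketch assumes raw.first_samp == 0.
events = np.column_stack([
    np.round(df["onset"].to_numpy() * SFREQ).astype(int),
    np.zeros(len(df), dtype=int),
    df["trial_type"].map(event_id).to_numpy(),
])

# One metadata row per event: everything except the timing columns.
metadata = df.drop(columns=["onset", "duration"]).reset_index(drop=True)

# With a loaded Raw object, these would go to
# mne.Epochs(raw, events, event_id=event_id, metadata=metadata).
print(events)
print(metadata)
```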

agramfort • Jul 17 '21 06:07

As an easily accessible example: https://openneuro.org/datasets/ds002680/versions/1.0.0

This one has ReactionTimes & Stimulus-ID as "extra" columns.


As I said before, I, personally, know how to get around this problem. Making the case one last time:

  1. This is an (optional) feature of BIDS; I expected it to be supported when first using mne-bids
  2. Keeping the information in one object (e.g. raw.annotations) seems intuitive
  3. I think writing BIDS from mne should not require manually overwriting the events.tsv (afaik events_data in write_raw_bids has to be shape (n_events,3) right now)
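To illustrate point 3, the manual workaround amounts to something like the sketch below. The file name, the extra column and its values are made up; the first to_csv call merely stands in for whatever a BIDS writer produced.

```python
import tempfile
from pathlib import Path

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for an events.tsv written during BIDS conversion
    # (hypothetical file name).
    events_path = Path(tmp) / "sub-01_task-demo_events.tsv"
    pd.DataFrame({
        "onset": [0.5, 1.2],
        "duration": [0.0, 0.05],
        "trial_type": ["stimulus", "saccade"],
    }).to_csv(events_path, sep="\t", index=False)

    # Extra per-event columns kept outside of MNE (made-up values).
    extra = pd.DataFrame({"saccade_amplitude": [float("nan"), 3.7]})

    # Append the columns row by row and overwrite the file.
    # BIDS uses "n/a" for missing values.
    events = pd.read_csv(events_path, sep="\t")
    merged = pd.concat([events, extra], axis=1)
    merged.to_csv(events_path, sep="\t", index=False, na_rep="n/a")

    written = events_path.read_text()

print(written)
```

The row-wise concat only works if the extra table has exactly one row per written event, in the same order, which is part of why doing this by hand is error-prone.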

behinger • Jul 20 '21 10:07

Thanks for linking to the example dataset! I will take a look at this. 🤔

hoechenberger • Jul 20 '21 10:07

@behinger I'm a little busy these days, please feel free to ping me sometime later this week should you not see any activity here :)

hoechenberger • Jul 26 '21 08:07

ping :)

behinger • Aug 02 '21 06:08

Could we have a function read_bids_events? It could be used internally by read_raw_bids but also exposed to users who want better control.
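A rough sketch of what such a function might look like. The signature and the validation are assumptions for illustration and not an existing mne-bids API; only the name comes from the suggestion above.

```python
import io

import pandas as pd

# onset and duration are the mandatory columns of a BIDS events file.
REQUIRED_COLUMNS = ("onset", "duration")


def read_bids_events(events_tsv):
    """Hypothetical helper: return all columns of an events.tsv as a DataFrame."""
    events = pd.read_csv(events_tsv, sep="\t", na_values="n/a")
    missing = [col for col in REQUIRED_COLUMNS if col not in events.columns]
    if missing:
        raise ValueError(f"events file lacks required columns: {missing}")
    return events


# Usage with an in-memory file; a path to a real events.tsv works the same way.
demo = io.StringIO("onset\tduration\tvalue\tstim_id\n0.5\t0.0\t1\tA3\n")
events = read_bids_events(demo)
print(events)
```

Returning the full table leaves it to the caller (or to read_raw_bids internally) to decide which columns end up as annotations.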

jasmainak • Aug 30 '21 19:08