Eyetracking data: Need a function to identify epochs with excessive blinks
Describe the new feature or enhancement
In eyetracking experiments we often have participants that, for reasons such as fatigue and higher sensitivity to stimuli, blinked excessively in some trials. In some cases, if a participant blinks for more than 40% of the trial's total duration, the entire trial becomes unusable even after interpolation and need to be excluded.
It'd be much more convenient if we can have a function that automate trial exclusion, or add in the mne.Epochs() function an additional reason to drop epochs with too many missing data.
Describe your proposed implementation
Additional feature in mne.Epochs(), where trials with a given percentage of total duration marked as missing data (e.g. 'BAD_blink' or 'blink') are automatically rejected
Describe possible alternatives
A new function like reject_missing_data() that operates on Epoched data
Additional context
Currently after epoching, each epoch has its own timestamp relative to its baseline (if a baseline is given), but the annotations still keep their timestamps in raw format. This adds to the workload when users need a customised function that achieves blink-based trial rejection as most users are not very proficient in MNE-Python's underlying data structure
A couple possible approaches come to mind:
- compute blink events, interpolate, epoch, determine which epochs to drop based on how many blink events occur within the epoch
- we split the functionality of
mne.preprocessing.eyetracking.interpolate_blinksinto two steps: detect blinks and replace withNaNinstead of interpolating, then later (possibly doing other stuff in between) replace NaNs with interpolated values. That way you could do blinks→NaN, then reject epochs based on percentage of NaN values, then (after dropping epochs) NaN→interpolated values
A couple possible approaches come to mind:
- compute blink events, interpolate, epoch, determine which epochs to drop based on how many blink events occur within the epoch
- we split the functionality of
mne.preprocessing.eyetracking.interpolate_blinksinto two steps: detect blinks and replace withNaNinstead of interpolating, then later (possibly doing other stuff in between) replace NaNs with interpolated values. That way you could do blinks→NaN, then reject epochs based on percentage of NaN values, then (after dropping epochs) NaN→interpolated values
good points @drammock, just a few thoughts about the two approaches
approach 1 would circle back to what I mentioned in the 'additional context' part - epochs have their own timestamps but annotations are not attached to their corresponding epochs, which would require quite some coding skills and familiarity with mne structures to determine which blink event belong to which epoch
approach 2 would not be feasible. mne.Epochs() doesn't seem to cope well with NAN values and I've encountered cases where NAN's propagated to the entire trial after epoching (cf. https://mne.discourse.group/t/epoching-eyetracking-data-caused-pupil-data-to-be-nand/11433). Hence interpolations must occur before epoching, at least for eyetracking data.
I'm still working on some custom coding to see if I can get hold of this problem, but I'm really not an expert in coding...
Here's some code to get you started with option 1:
import mne
import numpy as np
raw_fpath = mne.datasets.sample.data_path() / "MEG" / "sample" / "sample_audvis_raw.fif"
raw = mne.io.read_raw_fif(raw_fpath, preload=False)
events = mne.find_events(raw)
blink_events = mne.preprocessing.find_eog_events(raw)
# only care about one event type (for demo)
events = events[events[:, -1] == 1]
# make epochs
tmin = -0.5
tmax = 1
event_dict = {"auditory/left": 1}
epochs = mne.Epochs(raw, events=events, event_id=event_dict, tmin=tmin, tmax=tmax)
# get the times of the epoch boundaries
epoch_tzeros = raw.times[events[:, 0] - raw.first_samp]
epoch_t_starts = epoch_tzeros + tmin
epoch_t_ends = epoch_tzeros + tmax
# get the times of the blink events
blink_event_times = raw.times[blink_events[:, 0] - raw.first_samp]
# count number of blinks per epoch
n_blinks_per_epoch = list()
for t_start, t_end in zip(epoch_t_starts, epoch_t_ends):
n_blinks_per_epoch.append(
np.count_nonzero((blink_event_times >= t_start) & (blink_event_times < t_end))
)