mne-bids-pipeline icon indicating copy to clipboard operation
mne-bids-pipeline copied to clipboard

Temporally restricted channel interpolation

Open hoechenberger opened this issue 5 years ago • 14 comments

Probably needs to be solved upstream, but dropping it here for initial discussion

Currently, bad channel detection via find_bad_channels_maxwell() works on a global scale: if a user-defined threshold is exceeded frequently enough, the respective channel is marked as bad. It will typically be be interpolated via Maxwell filtering as the next step.

Now @SophieHerbst is running into two issues with this approach:

  • If flux spikes occur only rarely and / or their amplitude is only of "medium" size, the threshold will not be exceeded often enough to classify the channel as bad. Lowering the threshold (or required number the threshold must be exceeded), on the other hand, leads to many false-positives in other channels.
  • Flux spikes are a transient artifact; outside of those spikes, the channel may deliver usable data. However, currently this information will be lost if the entire channel is marked as bad and is subsequently interpolated.

A potential approach to tackle both issues could be to:

  1. Run a flux spike detection without global thresholding – just return the temporal location of the spikes
  2. Annotate those spikes, e.g. as FLUX_SPIKE, and keep track of the affected channel(s)
  3. Interpolate channel(s) during that time period

Thoughts?

hoechenberger avatar May 19 '20 09:05 hoechenberger

having a function to annotation flux jumps automatically would be nice !

@larsoner or @SherazKhan or @jasmainak any idea of how to do this?

is https://www.ncbi.nlm.nih.gov/pubmed/29448077 an approach to consider?

agramfort avatar May 19 '20 13:05 agramfort

or maybe TDDR for NIRS implements something useful. I have no clue

agramfort avatar May 19 '20 13:05 agramfort

I was thinking we could re-use the algorithm that find_bad_channels_maxwell() uses, just lower the threshold and report each time point where the threshold was exceeded

hoechenberger avatar May 19 '20 13:05 hoechenberger

We could let find_bad_channels_maxwell optionally:

return_scores : bool
    If True (default False), return the scores and time of the window edges used.

Returns
-----------
...
scores : ndarray, shape (n_meg, n_windows)
    The scores for the MEG channels. Only returned when ``return_scores`` is True.
bin_edges : ndarray, shape (n_windows + 1,)
    The window boundaries (in seconds) used to calculate the scores.

This would be half a dozen lines to add, and would enable time-varying correction, I think.

larsoner avatar May 19 '20 14:05 larsoner

@MerlinDumeur has coded some stuff early a few months back as part of the multifracs project and could probably use some help to integrate it to your efforts

virvw avatar May 19 '20 15:05 virvw

bin_edges : ndarray, shape (n_windows + 1,)
    The window boundaries (in seconds) used to calculate the scores.

I might be overly picky here but I'd love to be "closer" to the actual flux spike… the window is relatively large in comparison.

hoechenberger avatar May 19 '20 15:05 hoechenberger

@MerlinDumeur has coded some stuff early a few months back as part of the multifracs project and could probably use some help to integrate it to your efforts

👍👍👍

hoechenberger avatar May 19 '20 15:05 hoechenberger

I might be overly picky here but I'd love to be "closer" to the actual flux spike… the window is relatively large in comparison.

The method does not provide this information, so that would require more work / novel designs. You could decrease the window size to something quite a bit smaller, though (1 sec?) if you want. You could also use this as an estimate of the time, and do some peak finding and prominence estimation (easy with newest scipy find_peaks) after the fact. I don't think it should be the job of find_bad_channels_maxwell to take these additional steps, though.

larsoner avatar May 19 '20 18:05 larsoner

@MerlinDumeur has coded some stuff early a few months back as part of the multifracs project and could probably use some help to integrate it to your efforts

I actually worked on detecting flux jumps which are a bit different from spikes, and should be handled differently. However when it comes to finding flux spikes, I suggested using scipy.signal.find_peaks to @SophieHerbst, which she reported worked well for her, though she had to adjust the parameters to fit her data. You could use that function to get a precise location for all the peaks in the signal.

MerlinDumeur avatar May 19 '20 19:05 MerlinDumeur

I think detection is not the challenge, how do you interpolate without introducing artifacts? You'd need some subspace methods but they don't always work IMO

jasmainak avatar May 19 '20 20:05 jasmainak

@MerlinDumeur

I actually worked on detecting flux jumps which are a bit different from spikes, and should be handled differently.

Oh, I didn't know that. I've always used these terms interchangeably 🙈

@larsoner Maybe easiest for starters would be to follow your suggestion to return the affected windows from find_bad_channels_maxwell(), and then run the peak finder you and @MerlinDumeur mentioned to narrow down the time points of the flux spikes within each segment?

hoechenberger avatar May 19 '20 20:05 hoechenberger

@jasmainak

I think detection is not the challenge, how do you interpolate without introducing artifacts? You'd need some subspace methods but they don't always work IMO

Which kind of artifacts would we expect? DC jumps at the edges? Couldn't we simply remove these DC offsets, at least on one of the sides? (As you can probably tell, I'm new to this kind of stuff!)

hoechenberger avatar May 19 '20 20:05 hoechenberger

yes, you'll have sharp jumps at the edges. Then if you filter it, you're screwed

jasmainak avatar May 19 '20 20:05 jasmainak

@larsoner @hoechenberger Unfortunately, even the detection of bad sensors remains a problem for me.
Here is an example file (downsampled to 100Hz for speed):

It is admittedly not the cleanest data: there are 2 channels that have large square-like spikes: MEG0621 & MEG0341. The data also has a very noisy segment at the beginning when the recording was started, and some generally noisy sensors.

The algorithm detects:

Static bad channels: ['MEG0423', 'MEG0433', 'MEG1043', 'MEG1042', 'MEG1041'] Static flat channels: ['MEG0213', 'MEG1532']

As @MerlinDumeur said, I tried to run the scipy peak finder on the first derivative of my raw data to find sharp edges, which works for single subjects, but still requires too much individual tweaking of the threshold parameters for each dataset.

Example code:

import mne
# read raw data
raw = mne.io.read_raw_fif('bad_sensors_example.fif',
                          allow_maxshield=True,
                          preload=True, verbose='error')
  
# find bad channels through auto detect
auto_noisy_chs, auto_flat_chs = mne.preprocessing.find_bad_channels_maxwell(raw,
      limit = 7., min_count=3,
      verbose=True)

# add them all
all_bads = auto_noisy_chs + auto_flat_chs
all_bads = list(set(all_bads))
raw.info['bads'].extend(all_bads)

raw.pick(['meg']).plot(scalings={'grad': 500e-13, 'mag': 500e-15}, duration=10) 

SophieHerbst avatar May 20 '20 09:05 SophieHerbst