Compute t/f representation with variable-length epochs?
In many studies we conduct, people solve simple arithmetic problems. I am interested in EEG activity during the problem solving phase, which differs in length (1) from epoch to epoch and (2) from person to person.
These variable-length epochs are problematic, because I cannot simply compute a t/f representation with standard MNE functionality. Since I am mainly looking at oscillatory activity in specific frequency bands, my current workaround is to average band power over time within each epoch so that only one value per epoch remains. Of course, this means that I completely lose any time information, which is why this is a really crude approach.
EEGLAB has an option to "timewarp" individual epochs (https://github.com/eeglabdevelopers/eeglab/blob/develop/functions/timefreqfunc/newtimef.m#L144). I haven't really looked at the code so I'm not exactly sure how this works, but it seems to adequately address my problem. Does MNE have something similar? What do others do with such kind of data? I haven't seen any studies using the same variable-length epochs structure out there, and it seems like people just use some (arbitrary) fixed length cutoff. However, this is not ideal because I would like to make sure that only task-relevant brain activity gets into the calculation, so I don't want to cut off epochs at a fixed latency. Instead, I want my epochs to only contain EEG activity related to the problem solving process (i.e. from problem presentation up to a response).
Any ideas or suggestions?
BTW, this is not about creating an object to hold variable-length epochs (this has been discussed in #3533).
> These variable-length epochs are problematic, because I cannot simply compute a t/f representation with standard MNE functionality.
You can use the `_array` functions and either pass them one at a time or (what I would try first) pad them all with 0 or NaN to be the length of the longest epoch.
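A minimal sketch of the zero-padding idea, assuming `epochs_list` holds one `(n_channels, n_times_i)` array per trial and a 500 Hz sampling rate (both placeholders):

```python
import numpy as np
from mne.time_frequency import tfr_array_morlet

sfreq = 500.0                 # assumed sampling rate
freqs = np.arange(4, 31)      # 4-30 Hz

# pad every trial with zeros up to the length of the longest trial
n_max = max(e.shape[-1] for e in epochs_list)
padded = np.zeros((len(epochs_list), epochs_list[0].shape[0], n_max))
for i, e in enumerate(epochs_list):
    padded[i, :, :e.shape[-1]] = e

# power has shape (n_epochs, n_channels, n_freqs, n_max)
power = tfr_array_morlet(padded, sfreq=sfreq, freqs=freqs,
                         n_cycles=freqs / 2.0, output="power")
```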
> ... it seems like people just use some (arbitrary) fixed length cutoff. However, this is not ideal because I would like to make sure that only task-relevant brain activity gets into the calculation, so I don't want to cut off epochs at a fixed latency. Instead, I want my epochs to only contain EEG activity related to the problem solving process (i.e. from problem presentation up to a response).
Even if you obtain TF representations of length 10 s when some trials only have 5 s of usable information, you can use NumPy indexing and other simple operations to decide how to process the data in the TF plane. And for viz with things like `imshow` or `pcolor` you can mask the invalid region.
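For the masking, something like this works (a sketch; `power_ep`, `n_valid`, `times`, and `freqs` are placeholders for one epoch's TFR, its number of valid samples, and the plot axes):

```python
import numpy as np
import matplotlib.pyplot as plt

# mask everything after the trial's last valid sample
mask = np.zeros(power_ep.shape, dtype=bool)   # power_ep: (n_freqs, n_times)
mask[:, n_valid:] = True
masked = np.ma.masked_array(power_ep, mask=mask)

# pcolormesh leaves masked cells blank
plt.pcolormesh(times, freqs, masked)
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.show()
```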
> my current workaround is to average band power over time within each epoch so that only one value per epoch remains. Of course, this means that I completely lose any time information, which is why this is a really crude approach.
You could instead compute the envelope of the Hilbert transform of the band-passed data. This would give you time-varying power in a band. You could even do this step on `Raw` and then epoch it directly.
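A sketch of that approach, assuming an existing preloaded `raw` and an `events` array (alpha band as an example):

```python
import mne

# band-pass the continuous data, then take the analytic amplitude
raw_alpha = raw.copy().filter(8.0, 12.0)
raw_alpha.apply_hilbert(envelope=True)

# epoching afterwards keeps the time-varying band amplitude
epochs = mne.Epochs(raw_alpha, events, tmin=-0.5, tmax=5.0,
                    baseline=None, preload=True)
power = epochs.get_data() ** 2   # squared envelope = band power over time
```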
I've seen people bin epochs by length.
Hm. I think the problem is that the t/f map is computed by averaging over epochs. Therefore, filling with NaNs to make all epochs the same length would probably work, but this would be different from what I'd like to achieve. Ideally, each epoch should contain 100% of the process of interest. If I fill the shorter epochs with NaNs, averaging will not align the underlying processes correctly. I think the right way to do this would be to first scale/stretch all epochs to the same length (100%) and then do the averaging.
@agramfort can you point me to a publication where people binned epochs by length?
what do you mean stretch / scale?
no sorry I don't have published work in mind, but do you understand what I mean?
> what do you mean stretch / scale?
To compute a t/f map, I need to average over epochs of the same length. If I have variable-length epochs, I need to scale them so that they are the same length. In the audio domain, such algorithms are known as time stretching (see e.g. here). I tried applying such an algorithm to EEG epochs, but it doesn't work since this approach seems to be optimized for audio data. In EEGLAB, it is possible to stretch single epoch t/f representations before averaging, which they call time warping. I'm not sure this is the same as dynamic time warping, which is popular in bioinformatics, but less so in neuroscience.
Again, the idea is that I want to capture the calculation process, which in general varies in length from epoch to epoch. This paper uses the EEGLAB functionality I mentioned to align the heel strikes to create constant-length epochs.
> no sorry I don't have published work in mind, but do you understand what I mean?
Sort of. However, I'm not sure how binning epochs helps in computing a t/f map.
If you cannot tell when things happen because the timing varies, I would use CSP, average the (log) power in CSP "source space", and see if you have a statistical effect in this "feature space".
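To make the CSP idea concrete, a minimal sketch (here `X` is band-passed epochs data of shape `(n_epochs, n_channels, n_times)` cropped to a common length, and `y` holds the condition labels; both are assumptions):

```python
from mne.decoding import CSP

# log=True returns the log-power of each CSP component per epoch
csp = CSP(n_components=4, log=True)
features = csp.fit_transform(X, y)   # shape (n_epochs, n_components)
# `features` can then go into a classifier or a statistical test
```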
Hm, not sure I understand how that would work. I'll try playing around a bit with dynamic time warping (e.g. using fastdtw). I think DTW could be applied to each single-epoch spectrum to make them the same length, after which they could be averaged just like normal. I think that's what EEGLAB is doing. If I come up with something useful I'll report back, maybe this is something MNE users would also like.
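In case it helps others reading along, a rough sketch with the fastdtw package, aligning one trial's band-power envelope to a reference envelope (both 1-D arrays; all names here are placeholders):

```python
import numpy as np
from fastdtw import fastdtw

# warping path between reference and trial (a list of index pairs)
distance, path = fastdtw(reference, trial)
path = np.asarray(path)

# map each trial sample onto the reference time base
warped = np.empty(len(reference))
warped[path[:, 0]] = trial[path[:, 1]]   # later matches overwrite earlier
```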
FWIW I'd love support for TF decomposition of raws, and support for operating on continuous TF data.
But that's a big project that probably few will benefit from.
you mean viz or computation? what is the use case?
A few examples:
- rERSP (time-resolved regression in the time-frequency domain)
- time-frequency analysis with dense spacing of events - maybe you'll end up decomposing every sample 5 times ...
- this thread :)
provide me a nice dataset one could use for demo / education and we'll see how to make this happen.
You convinced me of the necessity of metadata with the kiloword dataset ...
Actually, it was much simpler than I thought. Basically, I resampled all individual TFRs (one per variable-length epoch) to a common length. Then I averaged these resampled TFRs to create the final TFR plot. Below is a comparison between EEGLAB and MNE for real EEG data. The upper row shows the normal TFRs computed on 5 s epochs (after the cue); 5 s was the length of the longest epoch, and most epochs are much shorter than that. The bottom row shows the "warped" result, where I only use data from the actual variable-length epochs.
Besides the scale of the data (which obviously isn't the same) and the ugly colormap (which I set because that's what EEGLAB uses), I think the results look pretty good.
Plus, the x axis labels are incorrect: in the warped results (bottom row), the data from 0 s to 5 s are actually the warped time points from cue until response, so the axis should really be labeled 0% to 100% or something similar.
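For reference, a condensed sketch of this procedure (not the exact code behind the figure; `epochs_list`, the sampling rate, and the frequency grid are placeholders):

```python
import numpy as np
from scipy.ndimage import zoom
from mne.time_frequency import tfr_array_morlet

sfreq = 500.0
freqs = np.arange(4, 31)
n_out = 500   # common output length, i.e. 0 % .. 100 % of each trial

warped = []
for epoch in epochs_list:   # each epoch: (n_channels, n_times_i)
    # single-epoch TFR: (n_channels, n_freqs, n_times_i)
    power = tfr_array_morlet(epoch[np.newaxis], sfreq=sfreq, freqs=freqs,
                             n_cycles=freqs / 2.0, output="power")[0]
    # stretch only the time axis to the common length
    warped.append(zoom(power, (1, 1, n_out / power.shape[-1]), order=1))

tfr_warped = np.mean(warped, axis=0)   # (n_channels, n_freqs, n_out)
```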
I prefer the mne-python results :)
😛 yes but this was not the point. I think I'll prepare a short example, but I need to find some open dataset that features variable length epochs. Any ideas?
no idea on my side
@cbrnr You could pick any dataset that has information on reaction times - 'epoching' from -x seconds wrt the stimulus onset up to x seconds wrt the reaction time will give you variable-length epochs.
E.g. the EEGLAB data.
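For instance (a sketch, assuming a preloaded `raw` plus per-trial stimulus sample indices `stim_samples` and reaction times `rts` in seconds):

```python
import numpy as np

sfreq = raw.info["sfreq"]
data = raw.get_data()   # (n_channels, n_times)

# one variable-length "epoch" per trial: stimulus onset -> response
epochs_list = []
for onset, rt in zip(stim_samples, rts):
    stop = onset + int(round(rt * sfreq))
    epochs_list.append(data[:, onset:stop])
```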
What was the final outcome of this? I have subjects performing a reach and grasp task, EEG and kinematics collected, and would like to align the EEG per reach and grasp phase for TF analysis.
I think we decided not to integrate this into MNE because it's probably not interesting for most people. However, I have been working on writing up a blog post on how to create these time-warped t/f maps, but I got stuck because I didn't really have good example data. The EEGLAB data does have variable length epochs, but they are very short, so the results do not look very interesting. I'd prefer if I found a better dataset (open access, freely available), but if not I'll probably just make do with what I have. I'll let you know in this issue once I've published the post (might still take a little while to polish everything).
@cbrnr - are you sure EEGLAB allows variable-length epochs? If so, this must have changed recently, as EEGLAB always required same-length epochs. I have been using FieldTrip a bit lately (I'm staying at the Donders for some time), and I must say I like how it deals with variable-length epochs and how that allows partial rejection (removing only parts of your epochs when artifacts do not span the whole epoch time range, which is really useful for long epochs). I even started thinking again about allowing NaNs in MNE epochs; that would be a big change, but it would allow both variable-length epochs (although not very memory efficient) and partial rejection.
The EEGLAB example data contains markers that would lead to variable length epochs. What they do in their tutorials is to make them fixed length. So yes, EEGLAB (and MNE) cannot deal with variable-length epochs, hence the need for a custom solution.
Thank you for your contributions to MNE! I have recently been using MNE to analyze EEG data together with gait events. We often normalize the varying step durations to 0-100%, so it would be great if variable-length EEG epochs could also be normalized to 0-100%. Thanks! May I know the link to your "blog on how to create these time-warped t/f maps"?
I tried converting the epochs into a NumPy array and then resampling them to the same length.
Yes, EEG data locked to gait cycles is a great example where this would be useful. I am currently restructuring my blog at https://cbrnr.github.io/ (I'm switching to Quarto), and after that I will be adding an article on how to compute t/f-maps for variable-length epochs.
It is basically resampling, but you have to resample in the frequency domain and not in the time domain. Resampling the time-domain data would change the frequency content, which you want to avoid.
Rather than warping or other techniques, I wound up computing a short-time Fourier transform with a 500 ms window around the times of interest for each movement. My code (MATLAB) is in the university's archive too; links are in the article: https://link.springer.com/article/10.1007/s00221-022-06340-8
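A rough Python equivalent of that STFT approach (the MATLAB original is linked above; `data`, `sfreq`, and the event sample indices `event_samples` are placeholders):

```python
import numpy as np
from scipy.signal import stft

# 0.5 s windows over the whole recording
f, t, Z = stft(data, fs=sfreq, nperseg=int(0.5 * sfreq))
power = np.abs(Z) ** 2          # (n_channels, n_freqs, n_windows)

# keep the STFT window closest to each movement event
cols = [np.argmin(np.abs(t - s / sfreq)) for s in event_samples]
event_power = power[..., cols]
```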
> Yes, EEG data locked to gait cycles is a great example where this would be useful. I am currently restructuring my blog at https://cbrnr.github.io/ (I'm switching to Quarto), and after that I will be adding an article on how to compute t/f-maps for variable-length epochs.
>
> It is basically resampling, but you have to resample in the frequency domain and not in the time domain. Resampling the time-domain data would change the frequency content, which you want to avoid.
Thanks for the reminder! Yes, resampling should happen in the frequency domain. I decided to use linear interpolation to warp the time axis into fixed-length epochs with varying time intervals.
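That could look something like this (a sketch; `tfr_trial` is one trial's TFR of shape `(n_freqs, n_times_i)` and is an assumption):

```python
import numpy as np

n_out = 101   # 0 % .. 100 % of the step cycle
x_old = np.linspace(0.0, 1.0, tfr_trial.shape[-1])
x_new = np.linspace(0.0, 1.0, n_out)

# linearly interpolate each frequency row onto the fixed time base
tfr_fixed = np.vstack([np.interp(x_new, x_old, row) for row in tfr_trial])
```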
Then you are already pretty much doing what I'm doing 😄!
> Then you are already pretty much doing what I'm doing 😄!
Dear Clemens,
Hope you are doing well! Thanks for your time.
I just got stuck at the step where I linearly interpolate the data. My plan is to divide each step into fixed epochs that are longer than a normal step. Then I calculate the ERSP using `mne.time_frequency.EpochsTFR`. Next, I want to interpolate the results of the EpochsTFR, but I got stuck at this step. I use tfr.average().data to obtain the ndarray of the data, then crop it to the target step length and linearly interpolate it to a longer fixed length. But I have no idea how to write the results back to tfr.average().data.
May I ask if you have any comments or suggestions on my approach? Maybe the reason is that I am a novice Python user. It would be great if you could give me some advice!
Thank you for your help!
You can create a new `EpochsTFR` object and fill it with the new data. If you have further questions, please ask them in our forum: https://mne.discourse.group/
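A sketch of that last step (exact class names depend on the MNE version; `info`, `warped` of shape `(n_channels, n_freqs, n_times)`, `freqs`, and `n_epochs` are placeholders):

```python
import numpy as np
import mne

times = np.linspace(0.0, 1.0, warped.shape[-1])   # warped 0..100 % axis
try:
    # MNE >= 1.7 exposes explicit array constructors
    tfr = mne.time_frequency.AverageTFRArray(info, warped, times, freqs,
                                             nave=n_epochs)
except AttributeError:
    # older versions: instantiate AverageTFR directly
    tfr = mne.time_frequency.AverageTFR(info, warped, times, freqs,
                                        nave=n_epochs)
tfr.plot(picks=[0])
```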
Thanks for your suggestion! Will do this in the forum.