XDF reading: merging streams via resampling impacts channels that contain NaN
- Tobii eyetracker devices may produce time series data that contain NaN values (for example pupil dilation when the eye is closed)
- When streaming several Tobii eyetracker streams to LSL and recording in XDF, one may eventually want to merge the separate eyetracker streams into one MNE raw object
- In theory, this is possible with `mnelab.io.xdf.read_raw_xdf`. However, the presence of NaN values in any channel of a stream results in that entire channel becoming NaN after reading.
I suspect this is due to the resampling done here:
https://github.com/cbrnr/mnelab/blob/4bbadaaa7e19d2a093c73c0c5afdc7110a5743c4/src/mnelab/io/xdf.py#L149
See this proof of principle:
```python
import numpy as np
import scipy.signal

x = np.arange(100) * 1.0
x[1] = np.nan  # a single NaN sample

# scipy.signal.resample is FFT-based, so one NaN propagates
# into every output sample
y = scipy.signal.resample(x, 90, axis=0)
assert np.isnan(y).all()  # this passes
```
I don't really know how to prevent this. Would masking NaNs with placeholder values prior to resampling, and converting them back to NaN afterwards, be an option?
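A minimal sketch of that mask-and-restore idea (the helper name and fill value are made up, and note that any leakage of the placeholder into neighboring samples is not undone, so this only hides the problem near NaN gaps):

```python
import numpy as np
import scipy.signal

def resample_with_nan_mask(x, num, fill=0.0):
    """Fill NaNs with a placeholder before resampling, then re-mark
    the corresponding output samples as NaN afterwards (sketch)."""
    x = np.asarray(x, dtype=float)
    nan_mask = np.isnan(x)
    y = scipy.signal.resample(np.where(nan_mask, fill, x), num)
    # Map the boolean mask onto the new sampling grid: a resampled
    # point becomes NaN if its nearest original sample was NaN.
    idx = np.round(np.linspace(0, len(x) - 1, num)).astype(int)
    y[nan_mask[idx]] = np.nan
    return y

x = np.arange(100) * 1.0
x[1] = np.nan
y = resample_with_nan_mask(x, 90)
assert not np.isnan(y).all()  # only samples near the gap stay NaN
```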
Maybe some domain-specific preprocessing could be an option? Would using the last value instead of NaN make sense? I think some sort of interpolation will be necessary before resampling.
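For illustration, interpolating across NaN gaps before resampling could look like this (`interpolate_nans` is a made-up helper; it uses simple linear interpolation via `np.interp`, which may or may not be appropriate for pupil data):

```python
import numpy as np
import scipy.signal

def interpolate_nans(x):
    """Linearly interpolate across NaN gaps (sketch). np.interp fills
    leading/trailing NaNs with the nearest finite value."""
    x = np.asarray(x, dtype=float)
    nan = np.isnan(x)
    if nan.any():
        idx = np.arange(len(x))
        x = x.copy()
        x[nan] = np.interp(idx[nan], idx[~nan], x[~nan])
    return x

x = np.arange(100) * 1.0
x[1] = np.nan
y = scipy.signal.resample(interpolate_nans(x), 90)
assert not np.isnan(y).any()  # no NaN contamination anymore
```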
> Maybe some domain-specific preprocessing could be an option?
Yes, a current solution is to read each stream separately into MNE Raw objects (via MNELAB), then do the processing, then resample and crop the Raw objects, and finally call `raw.add_channels()` to concatenate them.
Perhaps this is the only sensible way to avoid unnecessarily blowing up the reading code here with an edge case 🤔
I mean, if this is a common operation for eyetracking streams it might be worth including it as an option. If it's really an edge case, not so much. Can you provide more details on how you actually "do the processing" for these kinds of signals? How do you get rid of the NaNs?
> I mean, if this is a common operation for eyetracking streams it might be worth including it as an option.
Yes, unlike in EEG or other time series, NaN values are to be expected in eye tracking, because when the tracker loses the eyes (e.g., due to a blink), a NaN value gets assigned.
> If it's really an edge case, not so much.
I think the edge case here is having two separate eye-tracking streams that need to be merged. Most people will only have one eye-tracking stream, and will thus only need to resample if they want to combine it with another stream.
But then again, many users might want to combine their eye-tracking and EEG streams, so perhaps this is a valid use case after all.
> Can you provide more details on how you actually "do the processing" for these kinds of signals? How do you get rid of the NaNs?
For now, I want to keep the NaNs in my MNE Raw object, because I eventually want to interpolate these values with dedicated functions, see also:
- https://github.com/mne-tools/mne-python/pull/12946
I am currently:
- reading each stream separately (no resampling)
- checking whether they are approximately aligned (same signal length)
- cropping the longer signal to the length of the shorter one if they are not aligned (usually the misalignment is only a few samples, which is not ideal but maybe OK)
- calling `add_channels()` to combine the recordings
☝️ I am not happy with this and will need to revise this in the future.
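For illustration only, the crop-then-combine idea can be sketched with plain NumPy arrays standing in for the per-stream Raw objects (shapes and data are made up; the real workflow would use `raw.crop()` and `raw.add_channels()`):

```python
import numpy as np

# Two hypothetical streams with slightly different lengths
stream_a = np.random.default_rng(0).normal(size=(2, 1000))  # 2 channels
stream_b = np.random.default_rng(1).normal(size=(3, 997))   # 3 channels, shorter
stream_b[1, 100:110] = np.nan  # NaNs survive because nothing is resampled

# Crop both to the shorter length, then stack channels
n = min(stream_a.shape[1], stream_b.shape[1])
merged = np.vstack([stream_a[:, :n], stream_b[:, :n]])

assert merged.shape == (5, 997)
assert np.isnan(merged).any() and not np.isnan(merged).all()
```

The key point is that because no resampling happens, the NaN samples stay confined to their original positions instead of contaminating whole channels.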