mne-python Importing Neuroscan Evoked and Epochs-like files

Describe the new feature or enhancement

Add support for importing Neuroscan .avg and .eeg files as Evoked and Epochs objects.

Describe your proposed implementation

I have written simple scripts that use struct to read the raw byte data. I could add these scripts to MNE in the form of an mne.read_neuroscan_epochs() function. Here is an example:

import os
import struct

import numpy as np


def eeg_to_ascii(
        file_name, chanlist='all', triallist='all', typerange='all',
        accepttype='all', rtrange='all', responsetype='all',
        data_format='auto'):
    """Read a binary Neuroscan .eeg file and return the scaled epoch data.

    The data are scaled to volts and returned as a NumPy array, together
    with the channel names, sampling rate, epoch start time, and sweep
    headers.
    """

    # Check if file ends with .eeg
    if not file_name.endswith('.eeg'):
        raise ValueError("File must be a binary EEG file (.eeg).")

    if not os.path.isfile(file_name):
        raise FileNotFoundError(f"File {file_name} not found.")

    with open(file_name, 'rb') as f:
        try:
            # Read general part of the ERP header and set variables
            f.read(20)  # skip revision number
            f.read(342)  # skip the remainder of the first 362 bytes

            nsweeps = struct.unpack('<H', f.read(2))[0]  # number of sweeps
            f.read(4)  # skip 4 bytes
            # number of points per waveform
            pnts = struct.unpack('<H', f.read(2))[0]
            chan = struct.unpack('<H', f.read(2))[0]  # number of channels
            f.read(4)  # skip 4 bytes
            rate = struct.unpack('<H', f.read(2))[0]  # sample rate (Hz)
            f.read(127)  # skip 127 bytes
            xmin = struct.unpack('<f', f.read(4))[0]  # in s
            xmax = struct.unpack('<f', f.read(4))[0]  # in s
            f.read(387)  # skip 387 bytes

            # Read electrode configuration
            chan_names = []
            baselines = []
            sensitivities = []
            calibs = []
            factors = []
            for elec in range(chan):
                chan_name = f.read(10).decode('ascii').strip('\x00')
                chan_names.append(chan_name)
                f.read(37)  # skip 37 bytes
                baseline = struct.unpack('<H', f.read(2))[0]
                baselines.append(baseline)
                f.read(10)  # skip 10 bytes
                sensitivity = struct.unpack('<f', f.read(4))[0]
                sensitivities.append(sensitivity)
                f.read(8)  # skip 8 bytes
                calib = struct.unpack('<f', f.read(4))[0]
                calibs.append(calib)
                factor = calib * sensitivity / 204.8
                factors.append(factor)

        except struct.error:
            raise ValueError(
                "Error reading binary file. File may be corrupted or not in the expected format.")
        except Exception as e:
            raise ValueError(f"Error reading file: {e}")

    # Read and scale the data points of each epoch
    data = np.empty((nsweeps, len(chan_names), pnts), dtype=float)
    sweep_headers = []

    # Constants for the sweep header size in bytes and data point size in bytes
    SWEEP_HEAD_SIZE = 13
    DATA_POINT_SIZE = 4

    with open(file_name, 'rb') as f:
        # Skip the 900-byte general header and the 75-byte header of each
        # channel so that the file pointer is at the start of the epoch data
        f.seek(900 + chan * 75)

        for sweep in range(nsweeps):
            # Read the sweep header
            try:
                # f.read(SWEEP_HEAD_SIZE)
                accept = struct.unpack('<c', f.read(1))[0]
                ttype = struct.unpack('<h', f.read(2))[0]
                correct = struct.unpack('<h', f.read(2))[0]
                rt = struct.unpack('<f', f.read(4))[0]
                response = struct.unpack('<h', f.read(2))[0]
                # reserved  struct.unpack('<h', f.read(2))[0]
                f.read(2)  # skip 2 bytes
                sweep_headers.append(
                    (accept, ttype, correct, rt, response, sweep))
            except struct.error:
                raise ValueError(
                    "Error reading sweep header. File may be corrupted or not in the expected format.")
            except Exception as e:
                raise ValueError(f"Error reading sweep header: {e}")

            for point in range(pnts):
                for channel in range(chan):
                    try:
                        # Read the data point as a 4-byte integer
                        value = struct.unpack('<l', f.read(DATA_POINT_SIZE))[0]

                        # Scale the data point to microvolts and store it in the data array
                        data[sweep, channel, point] = value * factors[channel]
                    except struct.error:
                        raise ValueError(
                            "Error reading data points. File may be corrupted or not in the expected format.")
                    except Exception as e:
                        raise ValueError(f"Error reading data points: {e}")

    # Convert data from microvolts to volts
    data = data * 1e-6
    # Return the data (in volts), channel names, sampling rate, epoch start
    # time, and sweep headers
    return data, chan_names, rate, xmin, sweep_headers
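
The per-point struct.unpack calls in the script above could probably be replaced by one read per sweep. The following is only a sketch, assuming the 13-byte sweep header and the point-by-point, channel-interleaved 4-byte integers that the script reads; read_sweep is a hypothetical helper name, and the bytes in the usage example are synthetic rather than taken from a real .eeg file.

```python
import io
import struct

# Sweep header layout taken from the script above: accept (char), trial type,
# correct, reaction time (float), response, then 2 reserved bytes (13 bytes).
SWEEP_HEADER = struct.Struct('<chhfh2x')


def read_sweep(f, n_points, n_channels, factors):
    """Read one sweep and return (header, data), with data[channel][point].

    Hypothetical helper: the layout is assumed from the posted script, not
    checked against the format documentation.
    """
    header = SWEEP_HEADER.unpack(f.read(SWEEP_HEADER.size))
    n_values = n_points * n_channels
    raw = f.read(4 * n_values)
    if len(raw) != 4 * n_values:
        raise ValueError('File ended in the middle of a sweep.')
    # Samples are stored point by point, with the channels interleaved.
    values = struct.unpack(f'<{n_values}l', raw)
    # De-interleave: every n_channels-th value belongs to the same channel.
    data = [
        [value * factors[ch] for value in values[ch::n_channels]]
        for ch in range(n_channels)
    ]
    return header, data


# Usage on a synthetic two-channel, three-point sweep (not a real file).
payload = SWEEP_HEADER.pack(b'\x01', 5, 1, 0.25, 2)
payload += struct.pack('<6l', 1, 10, 2, 20, 3, 30)
header, data = read_sweep(io.BytesIO(payload), 3, 2, factors=[1.0, 0.5])
```

Reading a whole sweep at once keeps the error handling in one place and avoids one function call per sample.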

Describe possible alternatives

I am also writing some C++ code for this that MNE could make use of.

Ultimately, it may also be simpler to write a separate library myself and just create EpochsArrays from scratch.

Additional context

No response

Jan 17 '24 08:01 withmywoessner

mne.read_evokeds_cnt and mne.read_epochs_cnt or similar seem reasonable to me, we have similar functions for MFF and EEGLAB. It would be great to reuse/refactor as much code from read_raw_cnt as possible. Or is this a different Neuroscan format altogether separate from cnt? If so, what do you use currently to read the raw data, if anything?

Jan 17 '24 16:01 larsoner

@larsoner I just use the struct library to read the raw binary data. This site (which MNE cites as a reference for cnt.py) describes how the various formats are structured. I believe the file headers are the same as for cnt, but the actual data are formatted differently for each.

Jan 17 '24 16:01 withmywoessner

Yeah, if we can reuse all the header and info setting code, then the new epochs and evoked functions can hopefully be very short!

Jan 17 '24 17:01 larsoner
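
As a rough illustration of what a shared header helper might look like, here is a sketch that reads the same general-header fields as the script in the issue, using struct.unpack_from at fixed offsets. The offsets are simply the running totals of the reads and skips in that script (they add up to 900 bytes, plus 75 bytes per channel); they are not checked here against the format documentation or against cnt.py. parse_general_header and data_offset are hypothetical names, not existing MNE functions, and the buffer in the usage example is synthetic.

```python
import struct

# Sizes implied by the reads and skips in the posted script (assumption).
GENERAL_HEADER_SIZE = 900
CHANNEL_HEADER_SIZE = 75


def parse_general_header(buf):
    """Extract the general-header fields that the posted script reads."""
    n_sweeps, = struct.unpack_from('<H', buf, 362)
    n_points, n_channels = struct.unpack_from('<HH', buf, 368)
    sfreq, = struct.unpack_from('<H', buf, 376)
    tmin, tmax = struct.unpack_from('<ff', buf, 505)
    return dict(n_sweeps=n_sweeps, n_points=n_points, n_channels=n_channels,
                sfreq=sfreq, tmin=tmin, tmax=tmax)


def data_offset(n_channels):
    """Return the byte offset at which the epoch data are assumed to start."""
    return GENERAL_HEADER_SIZE + n_channels * CHANNEL_HEADER_SIZE


# Usage on a synthetic header buffer (not a real file).
buf = bytearray(GENERAL_HEADER_SIZE)
struct.pack_into('<H', buf, 362, 40)
struct.pack_into('<HH', buf, 368, 256, 32)
struct.pack_into('<H', buf, 376, 500)
struct.pack_into('<ff', buf, 505, -0.5, 1.0)
header = parse_general_header(buf)
```

Parsing from a bytes buffer rather than a sequence of reads makes each field's position explicit, which may make it easier to compare against whatever the raw reader already does.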

Okay, thanks! Before I start, should I work on this in the cnt.py file? If so, should the file be renamed to neuroscan.py? There is also curry.py, which handles the newer Neuroscan file format. Maybe that should be placed in a neuroscan folder as well.

Jan 17 '24 18:01 withmywoessner

Yes, I think cnt.py is the right place. I wouldn't start a new folder; better to stick with our original naming. But it would be good to add a note to read_raw_cnt (and the functions that you add) that it's for reading older Neuroscan files. For example, searching https://mne.tools/dev/generated/mne.io.read_raw_cnt.html for "neuroscan" turns up nothing; I had to figure it out by searching "mne neuroscan" and finding https://mne.tools/stable/auto_tutorials/io/20_reading_eeg_data.html#neuroscan-cnt-cnt (which should maybe also be updated to mention this old/new format stuff).

Jan 17 '24 18:01 larsoner

Hey @larsoner, I don't think Neuroscan stores the event times of epochs with respect to the original data, just a list of epochs and some metadata related to response latencies and event codes. Is it all right if I include an option to make up event sampling times? I am not really familiar with the KIT and EEGLAB file formats, so I am unsure whether the readers also do this for those file types.

Jan 27 '24 21:01 withmywoessner

Is it all right if I include an option to make up event sampling times? I am not really familiar with the kit and eeglab file formats so I am unsure if the readers also do this for those file types

Yes, I would just make them up as np.arange(0, len(events)) * np.ceil((tmax - tmin) * sfreq).astype(int) or similar. Or, even better, check what we do in the read_epochs_* functions to see if we similarly allow inventing event times:

https://mne.tools/stable/generated/mne.read_epochs_kit.html https://mne.tools/stable/generated/mne.read_epochs_eeglab.html https://mne.tools/stable/generated/mne.read_epochs_fieldtrip.html

I suspect we have no option but to make up times of some sort, so I wouldn't bother making any option to control it (just try to do something reasonable).

Jan 28 '24 03:01 larsoner
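
The suggestion above can be sketched as follows: space the invented event samples one epoch length apart and put them in the usual (n_events, 3) MNE events layout of sample, previous value, and event code. make_up_events is a hypothetical helper name, and using the trial type from each sweep header as the event code is an assumption, not something the thread settles.

```python
import numpy as np


def make_up_events(trial_types, tmin, tmax, sfreq):
    """Build an (n_epochs, 3) events array with evenly spaced fake samples.

    Hypothetical helper: column 0 holds the invented sample index, column 1
    is left at zero, and column 2 holds the event code (assumed here to be
    the trial type stored in each sweep header).
    """
    n_epochs = len(trial_types)
    # One epoch length in samples, rounded up so that epochs never overlap.
    step = int(np.ceil((tmax - tmin) * sfreq))
    events = np.zeros((n_epochs, 3), dtype=int)
    events[:, 0] = np.arange(n_epochs) * step
    events[:, 2] = trial_types
    return events


# Usage with made-up values: three epochs of 1.5 s sampled at 100 Hz.
events = make_up_events([1, 2, 1], tmin=-0.5, tmax=1.0, sfreq=100)
```

Because the spacing is at least one epoch length, the invented samples are unique and strictly increasing.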