Importing Neuroscan Evoked and Epochs-like files
Describe the new feature or enhancement
Add support for importing Neuroscan .avg and .eeg files as Evoked and Epochs objects.
Describe your proposed implementation
I have written simple scripts that use struct to read the raw byte data. I could add these scripts to MNE in the form of an mne.read_neuroscan_epochs() function. Here is an example:
import os
import struct

import numpy as np


def eeg_to_ascii(
        file_name, chanlist='all', triallist='all', typerange='all',
        accepttype='all', rtrange='all', responsetype='all',
        data_format='auto'):
    """Read a binary Neuroscan .eeg file, scale the data, and return it in volts."""
    # Check if file ends with .eeg
    if not file_name.endswith('.eeg'):
        raise ValueError("File must be a binary EEG file (.eeg).")
    if not os.path.isfile(file_name):
        raise ValueError(f"File {file_name} not found.")
    with open(file_name, 'rb') as f:
        try:
            # Read general part of the ERP header and set variables
            f.read(20)  # skip revision number
            f.read(342)  # skip ahead to byte offset 362 (20 + 342)
            nsweeps = struct.unpack('<H', f.read(2))[0]  # number of sweeps
            f.read(4)  # skip 4 bytes
            # number of points per waveform
            pnts = struct.unpack('<H', f.read(2))[0]
            chan = struct.unpack('<H', f.read(2))[0]  # number of channels
            f.read(4)  # skip 4 bytes
            rate = struct.unpack('<H', f.read(2))[0]  # sample rate (Hz)
            f.read(127)  # skip 127 bytes
            xmin = struct.unpack('<f', f.read(4))[0]  # in s
            xmax = struct.unpack('<f', f.read(4))[0]  # in s
            f.read(387)  # skip 387 bytes (end of the 900-byte general header)
            # Read electrode configuration (75 bytes per channel)
            chan_names = []
            baselines = []
            sensitivities = []
            calibs = []
            factors = []
            for elec in range(chan):
                chan_name = f.read(10).decode('ascii').strip('\x00')
                chan_names.append(chan_name)
                f.read(37)  # skip 37 bytes
                baseline = struct.unpack('<H', f.read(2))[0]
                baselines.append(baseline)
                f.read(10)  # skip 10 bytes
                sensitivity = struct.unpack('<f', f.read(4))[0]
                sensitivities.append(sensitivity)
                f.read(8)  # skip 8 bytes
                calib = struct.unpack('<f', f.read(4))[0]
                calibs.append(calib)
                factor = calib * sensitivity / 204.8
                factors.append(factor)
        except struct.error:
            raise ValueError(
                "Error reading binary file. File may be corrupted or not in the expected format.")
        except Exception as e:
            raise ValueError(f"Error reading file: {e}")
    # Read and process epoch datapoints data
    data = np.empty((nsweeps, len(chan_names), pnts), dtype=float)
    sweep_headers = []
    # Constants for the sweep header size in bytes and data point size in bytes
    SWEEP_HEAD_SIZE = 13
    DATA_POINT_SIZE = 4
    with open(file_name, 'rb') as f:
        # Ensure the file pointer is at the beginning of the EEG data
        f.seek((900 + chan * 75))
        for sweep in range(nsweeps):
            # Read the sweep header (SWEEP_HEAD_SIZE bytes in total)
            try:
                accept = struct.unpack('<c', f.read(1))[0]
                ttype = struct.unpack('<h', f.read(2))[0]
                correct = struct.unpack('<h', f.read(2))[0]
                rt = struct.unpack('<f', f.read(4))[0]
                response = struct.unpack('<h', f.read(2))[0]
                f.read(2)  # skip 2 reserved bytes
                sweep_headers.append(
                    (accept, ttype, correct, rt, response, sweep))
            except struct.error:
                raise ValueError(
                    "Error reading sweep header. File may be corrupted or not in the expected format.")
            except Exception as e:
                raise ValueError(f"Error reading sweep header: {e}")
            # Samples are stored point by point, with all channels interleaved
            for point in range(pnts):
                for channel in range(chan):
                    try:
                        # Read the data point as a 4-byte integer
                        value = struct.unpack('<l', f.read(DATA_POINT_SIZE))[0]
                        # Scale the data point to microvolts and store it in the data array
                        data[sweep, channel, point] = value * factors[channel]
                    except struct.error:
                        raise ValueError(
                            "Error reading data points. File may be corrupted or not in the expected format.")
                    except Exception as e:
                        raise ValueError(f"Error reading data points: {e}")
    # Convert data from microvolts to volts
    data = data * 1e-6
    # Return the scaled data and the metadata needed to build epochs
    return data, chan_names, rate, xmin, sweep_headers
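The inner loops of the script above unpack one 4-byte sample at a time, which can be slow for large files. As a rough sketch only, assuming the same layout the script uses (a 13-byte sweep header followed by little-endian int32 samples stored point by point with channels interleaved), a whole sweep could be read in one call with np.frombuffer. The name read_sweep is hypothetical and is not an existing MNE function.

```python
import io
import struct

import numpy as np

# Sweep header layout taken from the script above:
# accept (char), type, correct (int16), rt (float32), response, reserved (int16)
SWEEP_HEADER = struct.Struct('<chhfhh')  # 13 bytes, no padding


def read_sweep(f, pnts, chan, factors):
    """Read one sweep from file object f; return (header, data in volts)."""
    raw_header = f.read(SWEEP_HEADER.size)
    if len(raw_header) != SWEEP_HEADER.size:
        raise ValueError("Truncated sweep header.")
    accept, ttype, correct, rt, response, _reserved = SWEEP_HEADER.unpack(raw_header)

    n_bytes = pnts * chan * 4
    raw = f.read(n_bytes)
    if len(raw) != n_bytes:
        raise ValueError("Truncated sweep data.")
    # Samples are ordered point by point with channels interleaved,
    # so reshape to (pnts, chan) and transpose to (chan, pnts).
    samples = np.frombuffer(raw, dtype='<i4').reshape(pnts, chan)
    # Scale each channel to microvolts, then convert to volts.
    data = samples.T * np.asarray(factors, dtype=float)[:, np.newaxis] * 1e-6
    return (accept, ttype, correct, rt, response), data
```

Calling read_sweep once per sweep after seeking to 900 + chan * 75 should give the same array as the nested loops, under the layout assumptions stated above.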
Describe possible alternatives
I am also working on C++ code that does the same thing, which MNE could make use of.
Ultimately, it may also be simpler to write a separate library myself and just create EpochsArrays from scratch.
Additional context
No response
mne.read_evokeds_cnt and mne.read_epochs_cnt or similar seem reasonable to me; we have similar functions for MFF and EEGLAB. It would be great to reuse/refactor as much code from read_raw_cnt as possible. Or is this a different Neuroscan format altogether, separate from cnt? If so, what do you use currently to read the raw data, if anything?
I just use the struct library to read the raw binary data, @larsoner. According to this site (which MNE cites as a reference for cnt.py), here is how the various formats are structured:
I believe the file headers are the same as cnt, but as you can see the actual data is formatted differently for each.
Yeah if we can reuse all the header and info setting code then the new epochs and evoked functions can hopefully be very short!
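To illustrate what a shared header step might look like, here is a minimal sketch of a parser for the general header. The byte offsets are inferred from the skips in the script earlier in this issue and have not been checked against the existing code in cnt.py. The function name parse_general_header and the returned keys are hypothetical.

```python
import struct

GENERAL_HEADER_SIZE = 900   # bytes, per the script in this issue
ELECTRODE_HEADER_SIZE = 75  # bytes per channel, per the script in this issue


def parse_general_header(buf):
    """Extract the fields the script reads from the 900-byte general header."""
    if len(buf) < GENERAL_HEADER_SIZE:
        raise ValueError("Header is shorter than 900 bytes.")
    # Offsets are assumptions derived from the sequential reads in the script.
    (nsweeps,) = struct.unpack_from('<H', buf, 362)
    pnts, chan = struct.unpack_from('<HH', buf, 368)
    (rate,) = struct.unpack_from('<H', buf, 376)
    xmin, xmax = struct.unpack_from('<ff', buf, 505)
    return {
        'nsweeps': nsweeps,
        'pnts': pnts,
        'chan': chan,
        'rate': rate,
        'xmin': xmin,
        'xmax': xmax,
        # First byte after the general header and all electrode headers
        'data_offset': GENERAL_HEADER_SIZE + chan * ELECTRODE_HEADER_SIZE,
    }
```

A raw, epochs, or evoked reader could each call a function like this and then differ only in how they interpret the bytes starting at data_offset.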
Okay, thanks! Before I start, should I work on this in the cnt.py file? If so, should the file be renamed to neuroscan.py? There is also a curry.py, which handles the newer Neuroscan file format. Maybe that should be placed in a neuroscan folder as well.
Yes, I think cnt.py is the right place. I wouldn't start a new folder; better to stick with our original naming. But it would be good to add a note to read_raw_cnt (and the functions that you add) that it's for reading older Neuroscan files. For example, searching https://mne.tools/dev/generated/mne.io.read_raw_cnt.html for "neuroscan" turns up nothing at all; I had to figure it out by searching "mne neuroscan" and finding https://mne.tools/stable/auto_tutorials/io/20_reading_eeg_data.html#neuroscan-cnt-cnt (which should also maybe be updated to mention this old/new format stuff).
Hey @larsoner, I don't think Neuroscan stores the event times of epochs with respect to the original data, just a list of epochs and some metadata related to response latencies and event codes. Is it all right if I include an option to make up event sampling times? I am not really familiar with the KIT and EEGLAB file formats, so I am unsure whether the readers also do this for those file types.
Yes, I would just make them up as np.arange(0, len(events)) * np.ceil((tmax - tmin) * sfreq).astype(int) or similar. Or, even better, check what we do in the read_epochs_* functions to see if we similarly allow inventing event times:
https://mne.tools/stable/generated/mne.read_epochs_kit.html https://mne.tools/stable/generated/mne.read_epochs_eeglab.html https://mne.tools/stable/generated/mne.read_epochs_fieldtrip.html
I suspect we have no option but to make up times of some sort, so I wouldn't bother making any option to control it (just try to do something reasonable)
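A minimal sketch of the made-up event times suggested above, assuming the event codes come from the type field of each sweep header. The helper name make_up_events is hypothetical. The three-column layout (sample, previous value, event code) is the usual MNE events array convention.

```python
import numpy as np


def make_up_events(event_codes, tmin, tmax, sfreq):
    """Build an (n_epochs, 3) events array with evenly spaced fake onsets."""
    event_codes = np.asarray(event_codes, dtype=int)
    # One epoch length in samples; at least 1 so that onsets stay unique.
    spacing = max(int(np.ceil((tmax - tmin) * sfreq)), 1)
    onsets = np.arange(len(event_codes)) * spacing
    # Columns: sample index, previous value (unused, set to 0), event code
    return np.column_stack([onsets, np.zeros_like(onsets), event_codes])


events = make_up_events([1, 2, 1], tmin=-0.5, tmax=1.5, sfreq=100.0)
print(events)
```

The resulting array could then be passed as the events argument of mne.EpochsArray together with the data and tmin. The onsets carry no real timing information, only a unique and increasing sample index per epoch.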