clean_rawdata
clean_rawdata copied to clipboard
Different results with repeated runs of clean_rawdata
Hi there,
I just followed up on this thread (https://sccn.ucsd.edu/pipermail/eeglablist/2020/015456.html) I found on the eeglablist about inconsistent results when using clean_artifatcs for bad channel detection and removal. I encounter the same problem. After running quite a few tests with a couple of different datasets and different parameters I think I observed the following. Results are inconsistent when:
- The recording is short (less than 10 minutes recorded at 1000Hz before downsampling at 250Hz). So, in general, when the data has less than 150.000 samples. The less the number of samples, the more inconsistent the results are.
- The cut-off for the low-pass filter is 0.1 (as used in ERP research - I did not test with lower than this).
In my testing, these two conditions needed to be met in order to find inconsistent results. Furthermore, they were related, as the lower the number of samples, the higher the cut-off necessary to create inconsistent results. I think the problem arises from the use of rand() in clean_channels called in the background. Setting a seed create stable results, but it is difficult to tell which channels are actually bad.
I do understand this is a problem that not many people might run into given these conditions but it could be helpful to have a warning appearing (or simply being displayed on the console) in case someone is trying to process a short dataset
Proposed solution
Add a warning when calling clean_artifacts or pop_clean_rawdata (as it uses the former) if the length of EEG.data is shorter than N (I would say 150.000, that is 10minutes sampled/downsampled at 250Hz). A simple implementation would be:
if length(EEG.data) < 150000
warning('clean_artifacts: The dataset is rather short. It is possible to encounter inconsistent results for the detection of bad channels. Try a few times to verify this. ')
end
This for both the function (maybe in pop_clean_rawdata) the message could be displayed in a window (as it is likely that people are using the GUI in that case).
Thank you,
Daniele
We will look into it. As of recent revision, the results should be identical.
Hi @arnodelorme, following the eeglablist thread, here is a detailed account of what I tried and observed.
First observation
First, I tried to run the following pipeline (load data, add channel location, downsample to 250Hz, high-pass filter with Basic fir-filter new at 0.1Hz, low-pass filter with the same filter at 30Hz, run clean_rawdata). I added the code below. Note that I am using an EGI system, thus I am loading the corresponding channel locations, declaring Cz as the reference, removing the fiducial channels (step 2).
If I re-run the entire pipeline, then I observe inconsistent results in the channels detected as noisy. As described above, I think this is due to the length of the data. I tried with a longer dataset, by trimming it down at different lengths. This is how I came up with the threshold of 150000 sample points. Sadly, I am not the owner of the long dataset, so I cannot share it as this was not covered in the ethics when the experiment was run. However, I will share the shorter dataset. I will record a longer dataset in the following days, so I will test that and share it if necessary.
%% 1. Start EEGLAB and load unprocessed dataset (Common to both iterations)
[ALLEEG EEG CURRENTSET ALLCOM] = eeglab;
EEG = pop_loadset('filename',injectedFile,'filepath',injectedDir);
%% 2. Add channel location
EEG = pop_chanedit(EEG, 'load',{GSNchanloc,'filetype','sfp'}, ...
'changefield',{132,'datachan',0},'changefield',{132,'datachan',1},'changefield',{132,'datachan',0}); % Remove Cz from channel array (along with fiducial channels)
%% 3. Downsample
EEG = pop_resample(EEG, dwnSampFreq);
%% 4a. Highpass filter
EEG = pop_eegfiltnew(EEG, 'locutoff', 0.1, 'plotfreqz', 1); %0.1 high-pass filter on original data
EEG = pop_eegfiltnew(EEG, 'hicutoff',30,'plotfreqz',1); %30Hz low-pass filter on original data
%% Detecting Bad Channels
% Initialise dataset to use for channel removal - copy of EEG
EEGremoved = EEG;
% Remove channels from copy of current dataset (copyData or EEG)
[EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4, ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian');
% Obtain removed channels IDs and their total number
removedChannelIds = find(removedChannels == 1);
removedChannelNum = length(removedChannelIds);
Second observation
I was following the tests you uploaded here So, mimicking that structure, I tried to run multiple tests in a loop. ***Surprisingly, if I run clean_rawdata on the same dataset without loading it each time, then the results are stable. ***The code is as above, however, the last part is changed to the following loop:
%Store results
removed = struct('Nremoved',[],'Idx',[] );
for n=1:10
% Create copy of dataset
EEGremoved = EEG;
% Remove channels from copy of current dataset (copyData or EEG)
[EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4, ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian');
% Obtain removed channels IDs and their total number
removedChannelIds = find(removedChannels == 1);
removedChannelNum = length(removedChannelIds);
removed(n).Nremoved = removedChannelNum;
removed(n).Idx = removedChannelIds;
end
Conclusions
Unless I am doing something wrong (please forgive me if this is the case), the problem arises when the dataset is loaded and processed but not when the function is called on a pre-loaded dataset.
Attachments
I have attached the shorter dataset with the correct channel locations, but otherwise unprocessed. The file is available here
Hope this makes sense,
Thank you,
Daniele
Will try it out. Makoto can you confirm the problem as well? Cheers,
Arno
On Mar 9, 2022, at 3:38 PM, d-scanzi @.***> wrote:
Hi @arnodelorme, following the eeglablist thread, here is a detailed account of what I tried and observed.
First observation
First, I tried to run the following pipeline (load data, add channel location, downsample to 250Hz, high-pass filter with Basic fir-filter new at 0.1Hz, low-pass filter with the same filter at 30Hz, run clean_rawdata). I added the code below. Note that I am using an EGI system, thus I am loading the corresponding channel locations, declaring Cz as the reference, removing the fiducial channels (step 2).
If I re-run the entire pipeline, then I observe inconsistent results in the channels detected as noisy. As described above, I think this is due to the length of the data. I tried with a longer dataset, by trimming it down at different lengths. This is how I came up with the threshold of 150000 sample points. Sadly, I am not the owner of the long dataset, so I cannot share it as this was not covered in the ethics when the experiment was run. However, I will share the shorter dataset. I will record a longer dataset in the following days, so I will test that and share it if necessary.
%% 1. Start EEGLAB and load unprocessed dataset (Common to both iterations)
[ ALLEEG EEG CURRENTSET ALLCOM] = eeglab ; EEG = pop_loadset('filename',injectedFile,'filepath',injectedDir );
%% 2. Add channel location
EEG
= pop_chanedit(EEG, 'load',{GSNchanloc,'filetype','sfp' }, ...'changefield',{132,'datachan',0},'changefield',{132,'datachan',1},'changefield',{132,'datachan',0}); % Remove Cz from channel array (along with fiducial channels)
%% 3. Downsample
EEG
= pop_resample(EEG, dwnSampFreq );%% 4a. Highpass filter
EEG
= pop_eegfiltnew(EEG, 'locutoff', 0.1, 'plotfreqz', 1); %0.1 high-pass filter on original dataEEG
= pop_eegfiltnew(EEG, 'hicutoff',30,'plotfreqz',1); %30Hz low-pass filter on original data%% Detecting Bad Channels % Initialise dataset to use for channel removal - copy of EEG
EEGremoved
= EEG ;% Remove channels from copy of current dataset (copyData or EEG)
[ EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4 , ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian' );
% Obtain removed channels IDs and their total number
removedChannelIds = find(removedChannels == 1 ); removedChannelNum = length(removedChannelIds); Second observation
I was following the tests you uploaded here So, mimicking that structure, I tried to run multiple tests in a loop. ***Surprisingly, if I run clean_rawdata on the same dataset without loading it each time, then the results are stable. ***The code is as above, however, the last part is changed to the following loop:
%Store results
removed = struct('Nremoved',[],'Idx' ,[] );
for n=1:10
% Create copy of dataset
EEGremoved
= EEG ;
% Remove channels from copy of current dataset (copyData or EEG)
[
EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4 , ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian' );
% Obtain removed channels IDs and their total number
removedChannelIds
= find(removedChannels == 1 ); removedChannelNum = length(removedChannelIds );
removed(n).Nremoved = removedChannelNum ;
removed(n).Idx = removedChannelIds ;
end Conclusions
Unless I am doing something wrong (please forgive me if this is the case), the problem arises when the dataset is loaded and processed but not when the function is called on a pre-loaded dataset.
Attachments
I have attached the shorter dataset with the correct channel locations, but otherwise unprocessed. The file is available here
Hope this makes sense,
Thank you,
Daniele
— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you were mentioned.
I'll need time to get back to this issue, sorry.
Makoto
On Wed, Mar 9, 2022 at 3:42 PM Arnaud Delorme @.***> wrote:
Will try it out. Makoto can you confirm the problem as well? Cheers,
Arno
On Mar 9, 2022, at 3:38 PM, d-scanzi @.***> wrote:
Hi @arnodelorme, following the eeglablist thread, here is a detailed account of what I tried and observed.
First observation
First, I tried to run the following pipeline (load data, add channel location, downsample to 250Hz, high-pass filter with Basic fir-filter new at 0.1Hz, low-pass filter with the same filter at 30Hz, run clean_rawdata). I added the code below. Note that I am using an EGI system, thus I am loading the corresponding channel locations, declaring Cz as the reference, removing the fiducial channels (step 2).
If I re-run the entire pipeline, then I observe inconsistent results in the channels detected as noisy. As described above, I think this is due to the length of the data. I tried with a longer dataset, by trimming it down at different lengths. This is how I came up with the threshold of 150000 sample points. Sadly, I am not the owner of the long dataset, so I cannot share it as this was not covered in the ethics when the experiment was run. However, I will share the shorter dataset. I will record a longer dataset in the following days, so I will test that and share it if necessary.
%% 1. Start EEGLAB and load unprocessed dataset (Common to both iterations)
[ ALLEEG EEG CURRENTSET ALLCOM] = eeglab ; EEG = pop_loadset('filename',injectedFile,'filepath',injectedDir );
%% 2. Add channel location
EEG = pop_chanedit(EEG, 'load',{GSNchanloc,'filetype','sfp' }, ...
'changefield',{132,'datachan',0},'changefield',{132,'datachan',1},'changefield',{132,'datachan',0}); % Remove Cz from channel array (along with fiducial channels)
%% 3. Downsample
EEG = pop_resample(EEG, dwnSampFreq );
%% 4a. Highpass filter
EEG = pop_eegfiltnew(EEG, 'locutoff', 0.1, 'plotfreqz', 1); %0.1 high-pass filter on original data
EEG = pop_eegfiltnew(EEG, 'hicutoff',30,'plotfreqz',1); %30Hz low-pass filter on original data
%% Detecting Bad Channels % Initialise dataset to use for channel removal - copy of EEG
EEGremoved = EEG ;
% Remove channels from copy of current dataset (copyData or EEG)
[ EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4 , ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian'
);
% Obtain removed channels IDs and their total number
removedChannelIds = find(removedChannels == 1 ); removedChannelNum = length(removedChannelIds); Second observation
I was following the tests you uploaded here So, mimicking that structure, I tried to run multiple tests in a loop. ***Surprisingly, if I run clean_rawdata on the same dataset without loading it each time, then the results are stable. ***The code is as above, however, the last part is changed to the following loop:
%Store results
removed = struct('Nremoved',[],'Idx' ,[] );
for n=1:10
% Create copy of dataset
EEGremoved
= EEG ;
% Remove channels from copy of current dataset (copyData or EEG)
[
EEGremoved,~,~,removedChannels] = clean_artifacts(EEGremoved, 'FlatlineCriterion',5,'ChannelCriterion',0.75,'LineNoiseCriterion',4 , ...
'Highpass','off','BurstCriterion','off','WindowCriterion','off','BurstRejection','off','Distance','Euclidian'
);
% Obtain removed channels IDs and their total number
removedChannelIds
= find(removedChannels == 1 ); removedChannelNum = length(removedChannelIds );
removed(n).Nremoved = removedChannelNum ;
removed(n).Idx = removedChannelIds ;
end Conclusions
Unless I am doing something wrong (please forgive me if this is the case), the problem arises when the dataset is loaded and processed but not when the function is called on a pre-loaded dataset.
Attachments
I have attached the shorter dataset with the correct channel locations, but otherwise unprocessed. The file is available here
Hope this makes sense,
Thank you,
Daniele
— Reply to this email directly, view it on GitHub, or unsubscribe. Triage notifications on the go with GitHub Mobile for iOS or Android. You are receiving this because you were mentioned.