Problem with benchmark for Nakanishi2015 SSVEP
I am running an SSVEP benchmark, but I get different results for the Nakanishi2015 dataset depending on whether I use one or two datasets. With two datasets ("Kalunga2016", "Nakanishi2015"), Nakanishi2015 gets an average score of about 0.36. With Nakanishi2015 alone, it gets about 0.87, which seems to be the correct value if I look here.
I am using this pipeline https://github.com/NeuroTechX/moabb/blob/develop/pipelines/TSLR-SSVEP.yml in my pipeline folder.
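If I read the YAML correctly, it corresponds roughly to the following pyRiemann/scikit-learn chain (a sketch; the exact estimator parameters are the ones in the YAML file):

from moabb.pipelines import ExtendedSSVEPSignal
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Reshape the filter-bank epochs (channels x times x filters) into extended
# signals, estimate covariances, and classify in the Riemannian tangent space.
pipeline = make_pipeline(
    ExtendedSSVEPSignal(),
    Covariances(estimator="lwf"),
    TangentSpace(),
    LogisticRegression(),
)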
I am using this code:
import os

from moabb import benchmark, set_download_dir, set_log_level

pipeline_folder = "your pipeline folder"

results = benchmark(
    pipelines=pipeline_folder,
    evaluations=["WithinSession"],
    paradigms=["FilterBankSSVEP"],
    include_datasets=["Kalunga2016", "Nakanishi2015"],
    # include_datasets=["Nakanishi2015"],
    results="./results/",
    overwrite=True,
    plot=True,
    n_jobs=1,  # with more jobs memory is not enough; 4 is a good value if RAM allows
    output="./benchmark/",
)

print("Results:")
print(results.to_string())
print("Averaging the session performance:")
print(results.groupby("pipeline")[["score", "time"]].mean())

# save results
save_path = os.path.join(
    os.path.dirname(os.path.realpath(__file__)), "results_dataframe_test_SSVEP.csv"
)
results.to_csv(save_path, index=True)
print(results.groupby(["dataset", "pipeline"])[["score", "time"]].mean().to_string())
With include_datasets=["Kalunga2016", "Nakanishi2015"] I get:
Results:
score time samples subject session channels n_sessions dataset pipeline paradigm evaluation
0 0.687180 0.052597 64.0 1 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
1 0.764103 0.052398 64.0 2 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
2 0.858974 0.052598 64.0 3 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
3 0.733333 0.051998 64.0 4 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
4 0.547436 0.049798 64.0 5 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
5 0.719231 0.051400 64.0 6 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
6 0.853684 0.073397 96.0 7 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
7 0.765385 0.054198 64.0 8 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
8 0.564103 0.051598 64.0 9 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
9 0.594154 0.089997 128.0 10 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
10 0.547436 0.052798 64.0 11 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
11 0.868750 0.124597 160.0 12 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
12 0.255556 0.139595 180.0 1 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
13 0.161111 0.141395 180.0 2 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
14 0.427778 0.139996 180.0 3 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
15 0.394444 0.143395 180.0 4 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
16 0.466667 0.148195 180.0 5 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
17 0.350000 0.147795 180.0 6 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
18 0.294444 0.141395 180.0 7 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
19 0.433333 0.148794 180.0 8 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
20 0.511111 0.147396 180.0 9 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
With include_datasets=["Nakanishi2015"] I get:
score time samples subject session channels n_sessions dataset pipeline paradigm evaluation
0 0.711111 4.204810 180.0 1 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
1 0.483333 3.614036 180.0 2 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
2 0.894444 3.866471 180.0 3 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
3 0.950000 3.953306 180.0 4 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
4 0.966667 3.788466 180.0 5 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
5 1.000000 4.709309 180.0 6 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
6 0.955556 3.941562 180.0 7 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
7 0.933333 4.416215 180.0 8 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
8 0.961111 3.723069 180.0 9 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
I just tested with the latest version of MOABB and pyRiemann from git and I got the same results.
If I change the order of the two datasets, like this: include_datasets=["Nakanishi2015", "Kalunga2016"],
I get a lower score for Kalunga2016 instead. So it looks like, with two datasets, the second one in the list gets a lower score than it should.
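For reference, a minimal way to check the order dependence with the same arguments as the script above, only looping over the dataset order:

for order in (["Kalunga2016", "Nakanishi2015"], ["Nakanishi2015", "Kalunga2016"]):
    res = benchmark(
        pipelines=pipeline_folder,
        evaluations=["WithinSession"],
        paradigms=["FilterBankSSVEP"],
        include_datasets=order,
        results="./results/",
        overwrite=True,
        n_jobs=1,
        output="./benchmark/",
    )
    # per-dataset mean score; the second dataset in `order` comes out too low
    print(order)
    print(res.groupby("dataset")["score"].mean())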
This is extremely strange... Did you use overwrite=True in all these different experiments?
Yes, exactly as in the code above.
Hmmm, this is super strange. We need to meet to discuss this issue, @sylvchev, @carraraig.
It might be related to this issue. The code
def _inc_exc_datasets(datasets, include_datasets, exclude_datasets):
    d = list()
    if include_datasets is not None:
        # Assert if the inputs are key_codes
        if isinstance(include_datasets[0], str):
            # Map from key_codes to class instances
            datasets_codes = [d.code for d in datasets]
            # Get the indices of the matching datasets
            for incdat in include_datasets:
                if incdat in datasets_codes:
                    d.append(datasets[datasets_codes.index(incdat)])
does not raise an error if include_datasets contains incorrect codes provided by the user. This is related to https://github.com/NeuroTechX/moabb/issues/654.
No, it does not seem to be the source of the main performance problem, but a user warning for wrong codes would still be useful.
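For illustration, such a warning could look like this (a hypothetical sketch, not an actual patch):

import warnings

# hypothetical check inside _inc_exc_datasets, after datasets_codes is built:
unknown = [code for code in include_datasets if code not in datasets_codes]
if unknown:
    # warn (or raise) so that typos in include_datasets do not fail silently
    warnings.warn(f"Unknown dataset codes in include_datasets: {unknown}")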