Problem with benchmark for Nakanishi2015 SSVEP
I am running an SSVEP benchmark, but I get different results for the Nakanishi2015 dataset depending on whether I use one or two datasets. With two datasets ("Kalunga2016", "Nakanishi2015"), Nakanishi2015 gets an average score of about 0.36. With Nakanishi2015 alone, it gets about 0.87, which seems to be the correct value if I look here.
I am using this pipeline https://github.com/NeuroTechX/moabb/blob/develop/pipelines/TSLR-SSVEP.yml in my pipeline folder.
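If I read the YAML correctly, it corresponds roughly to the following pyRiemann/scikit-learn chain (a sketch; the exact estimator parameters are the ones in the YAML file):

from moabb.pipelines import ExtendedSSVEPSignal
from pyriemann.estimation import Covariances
from pyriemann.tangentspace import TangentSpace
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Reshape the filter-bank epochs (channels x times x filters) into extended
# signals, estimate covariances, and classify in the Riemannian tangent space.
pipeline = make_pipeline(
    ExtendedSSVEPSignal(),
    Covariances(estimator="lwf"),
    TangentSpace(),
    LogisticRegression(),
)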
I am using this code:
import os

from moabb import benchmark, set_download_dir, set_log_level

pipeline_folder = "your pipeline folder"

results = benchmark(
    pipelines=pipeline_folder,
    evaluations=["WithinSession"],
    paradigms=["FilterBankSSVEP"],
    include_datasets=["Kalunga2016", "Nakanishi2015"],
    # include_datasets=["Nakanishi2015"],
    results="./results/",
    overwrite=True,
    plot=True,
    n_jobs=1,  # with more jobs memory is not enough; 4 is a good value if RAM allows
    output="./benchmark/",
)

print("Results:")
print(results.to_string())
print("Averaging the session performance:")
print(results.groupby("pipeline")[["score", "time"]].mean())

# save results
save_path = os.path.join(
    os.path.dirname(os.path.realpath(__file__)), "results_dataframe_test_SSVEP.csv"
)
results.to_csv(save_path, index=True)
print(results.groupby(["dataset", "pipeline"])[["score", "time"]].mean().to_string())
With include_datasets=["Kalunga2016", "Nakanishi2015"] I get:
Results:
score time samples subject session channels n_sessions dataset pipeline paradigm evaluation
0 0.687180 0.052597 64.0 1 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
1 0.764103 0.052398 64.0 2 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
2 0.858974 0.052598 64.0 3 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
3 0.733333 0.051998 64.0 4 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
4 0.547436 0.049798 64.0 5 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
5 0.719231 0.051400 64.0 6 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
6 0.853684 0.073397 96.0 7 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
7 0.765385 0.054198 64.0 8 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
8 0.564103 0.051598 64.0 9 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
9 0.594154 0.089997 128.0 10 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
10 0.547436 0.052798 64.0 11 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
11 0.868750 0.124597 160.0 12 0 8 1 Kalunga2016 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
12 0.255556 0.139595 180.0 1 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
13 0.161111 0.141395 180.0 2 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
14 0.427778 0.139996 180.0 3 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
15 0.394444 0.143395 180.0 4 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
16 0.466667 0.148195 180.0 5 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
17 0.350000 0.147795 180.0 6 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
18 0.294444 0.141395 180.0 7 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
19 0.433333 0.148794 180.0 8 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
20 0.511111 0.147396 180.0 9 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
With include_datasets=["Nakanishi2015"] I get:
score time samples subject session channels n_sessions dataset pipeline paradigm evaluation
0 0.711111 4.204810 180.0 1 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
1 0.483333 3.614036 180.0 2 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
2 0.894444 3.866471 180.0 3 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
3 0.950000 3.953306 180.0 4 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
4 0.966667 3.788466 180.0 5 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
5 1.000000 4.709309 180.0 6 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
6 0.955556 3.941562 180.0 7 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
7 0.933333 4.416215 180.0 8 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
8 0.961111 3.723069 180.0 9 0 8 1 Nakanishi2015 SSVEP Tangent Space LR FilterBankSSVEP WithinSession
I just tested with the latest version of MOABB and pyRiemann from git and I got the same results.
If I change the order of the two datasets, like this: include_datasets=["Nakanishi2015", "Kalunga2016"],
I get a lower score for Kalunga2016 instead. So it looks like, with two datasets, the second one in the list gets a lower score than it should.
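For reference, a minimal way to check the order dependence with the same arguments as the script above, only looping over the dataset order:

for order in (["Kalunga2016", "Nakanishi2015"], ["Nakanishi2015", "Kalunga2016"]):
    res = benchmark(
        pipelines=pipeline_folder,
        evaluations=["WithinSession"],
        paradigms=["FilterBankSSVEP"],
        include_datasets=order,
        results="./results/",
        overwrite=True,
        n_jobs=1,
        output="./benchmark/",
    )
    # per-dataset mean score; the second dataset in `order` comes out too low
    print(order)
    print(res.groupby("dataset")["score"].mean())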
This is extremely strange... Did you use overwrite=True in all these different experiments?
Yes, exactly as in the code above.
Hmmm, this is super strange. We need to meet to discuss this issue, @sylvchev, @carraraig.
It might be related to this issue. The code
def _inc_exc_datasets(datasets, include_datasets, exclude_datasets):
    d = list()
    if include_datasets is not None:
        # Assert if the inputs are key_codes
        if isinstance(include_datasets[0], str):
            # Map from key_codes to class instances
            datasets_codes = [d.code for d in datasets]
            # Get the indices of the matching datasets
            for incdat in include_datasets:
                if incdat in datasets_codes:
                    d.append(datasets[datasets_codes.index(incdat)])
does not raise an error if include_datasets contains incorrect codes provided by the user. This is related to https://github.com/NeuroTechX/moabb/issues/654.
No, it does not seem to be the source of the main performance problem, but a user warning for wrong codes would still be useful.
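For illustration, such a warning could look like this (a hypothetical sketch, not an actual patch):

import warnings

# hypothetical check inside _inc_exc_datasets, after datasets_codes is built:
unknown = [code for code in include_datasets if code not in datasets_codes]
if unknown:
    # warn (or raise) so that typos in include_datasets do not fail silently
    warnings.warn(f"Unknown dataset codes in include_datasets: {unknown}")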