
[No Code] Discover new datasets

Open alexandrebarachant opened this issue 8 years ago • 38 comments

We need people browsing the web to discover interesting datasets that could be added to MOABB.

You can comment on this issue.

But first, check that your dataset is not already in the list.

What kind of datasets

We are interested in any dataset of neural time series (EEG, MEG, ECoG, and fNIRS) with a minimum of 5 subjects that is available online and to which we can apply machine learning algorithms. It does not need to be a BCI dataset, but it must contain different conditions/tasks, labelled and tagged.

How do I search for a new dataset?

Many of the datasets in the BNCI index have not been reported. You can start here.

Researchers are making more and more datasets available. Some databases exist and might contain interesting things:

Finally, Google is your friend.

How much time does it take?

Entering a new dataset should take you 2 minutes.

alexandrebarachant avatar Jun 03 '17 09:06 alexandrebarachant

found one here: https://depositonce.tu-berlin.de/handle/11303/6271

vinay-jayaram avatar Mar 20 '18 13:03 vinay-jayaram

Browsing PLOS ONE to find new motor imagery datasets:

  • [x] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0114853
    • 10 subjects
    • (left hand, right hand, feet) + (both hands, left hand combined with right foot, right hand combined with left foot)
  • [ ] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0188293
    • 12 subjects
    • 9 different elbow tasks + rest
  • [ ] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0162546
    • 13 subjects
    • left in Execution / Imagination / Observation. can use rest as a second class
  • [ ] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0121262
    • 30 subjects
    • left versus rest
  • [x] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0162657
    • 4 subjects, 3 sessions each
    • left-hand, right-hand, feet
  • [ ] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0193722
    • 14 subjects
    • left hand, right hand, rest. pre-epoched data :(
  • [ ] ~http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0143962~ (EEG data not available)
    • 18 subjects, 6 sessions
    • 3 MI tasks
    • can't find the data, but they are supposed to be available
  • [ ] http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174161
    • 12 subjects
    • Grasp, Elbow, rest

alexandrebarachant avatar Apr 01 '18 20:04 alexandrebarachant

re: dataset 1, All the trials are pre-epoched :(

vinay-jayaram avatar Apr 04 '18 15:04 vinay-jayaram

yeah. Actually i think the GigaDb dataset is already like this ...

alexandrebarachant avatar Apr 04 '18 15:04 alexandrebarachant

ah you're right, you just concatenated all the trials. In that case we can do the same here :) good good

vinay-jayaram avatar Apr 04 '18 16:04 vinay-jayaram

also regarding the second to last: Have you e-mailed Fabien?

vinay-jayaram avatar Apr 04 '18 16:04 vinay-jayaram

I'm definitely not super happy about the concatenation of individual trials. In the case of GigaDB, the dataset was too large to ignore. In those cases, we can contact the authors to ask them about the raw data, but concatenating is a good starting point to see whether the dataset is really interesting or not.

Also, let's contact fabien and camille about the second last dataset. I will do it today.

alexandrebarachant avatar Apr 04 '18 16:04 alexandrebarachant

regarding concatenation though: Couldn't we just add a buffer of zeros before and after each trial to smooth out border effects? After de-meaning the trials to eliminate the issue of offset

vinay-jayaram avatar Apr 05 '18 09:04 vinay-jayaram

Yep, we could. I think the most problematic part is the non-zero mean, which creates huge edge artifacts. We could also return MNE epochs in that case, but that is still not ideal from a filtering point of view.

In any case we might want to put a warning ?

alexandrebarachant avatar Apr 05 '18 12:04 alexandrebarachant

warning is good, will add

vinay-jayaram avatar Apr 05 '18 13:04 vinay-jayaram
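The padding idea discussed in the last few comments can be sketched roughly as follows. This is only an illustration with NumPy, not the code that ended up in MOABB: the function name, the `pad_samples` parameter, the array layout `(n_trials, n_channels, n_samples)`, and the warning text are all assumptions made for the example.

```python
import warnings

import numpy as np


def concatenate_epochs_with_padding(epochs, pad_samples):
    """Concatenate pre-epoched trials into one continuous array (hypothetical helper).

    epochs: array-like of shape (n_trials, n_channels, n_samples)
    pad_samples: number of zero samples inserted before and after each trial

    Returns (continuous, onsets): continuous has shape
    (n_channels, n_trials * (n_samples + 2 * pad_samples)) and onsets holds
    the sample index at which each trial starts.
    """
    epochs = np.asarray(epochs, dtype=float)
    n_trials, n_channels, n_samples = epochs.shape

    # Remove the per-trial, per-channel mean so each trial has no DC offset
    # relative to the zero padding around it.
    demeaned = epochs - epochs.mean(axis=2, keepdims=True)

    # Add a buffer of zeros before and after each trial along the time axis.
    padded = np.pad(demeaned, ((0, 0), (0, 0), (pad_samples, pad_samples)))

    # Stitch the padded trials together along time.
    continuous = np.concatenate(list(padded), axis=1)

    block = n_samples + 2 * pad_samples
    onsets = pad_samples + block * np.arange(n_trials)

    # Assumed wording; the thread only agrees that some warning should exist.
    warnings.warn(
        "Data were pre-epoched and concatenated; filtering may still "
        "produce artifacts at trial borders.",
        RuntimeWarning,
        stacklevel=2,
    )
    return continuous, onsets
```

De-meaning removes the offset but does not force a trial to start or end at zero, so some discontinuity at the borders can remain, which is why the warning is kept.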

The list is not synchronized with the documentation. Why? Can I help there?

Seburath avatar Feb 18 '19 22:02 Seburath

We will use this issue and the associated wiki page to keep track of the datasets that we could add to MOABB. Please comment on this issue if you want to report a new dataset.

sylvchev avatar Jun 10 '21 10:06 sylvchev

There is a nice dataset here for SSVEP and ERP using EEG and ear-EEG while standing or moving; the data are available here.

sylvchev avatar Dec 23 '21 13:12 sylvchev

Many datasets are listed here : https://www.researchgate.net/post/Are-there-any-public-EEG-data-sets-that-one-can-try-their-hands-on

sylvchev avatar Jun 01 '22 14:06 sylvchev

This dataset is interesting for its population age and size: it is an SSVEP dataset with 100 participants over 50 years old: https://www.nature.com/articles/s41597-022-01372-9

okbalefthanded avatar Feb 09 '23 11:02 okbalefthanded

This dataset could be integrated in MOABB, MI with information about subjects: https://zenodo.org/record/7554429

sylvchev avatar Jun 20 '23 09:06 sylvchev

Ideas for EEG datasets: https://www.fieldtriptoolbox.org/faq/open_data/

taziksh avatar Jan 16 '24 17:01 taziksh

okbalefthanded avatar Jan 31 '24 09:01 okbalefthanded

The "Inner Speech Dataset" was published in Nature and seems like a good fit to add support for. Paper: Thinking out loud, an open-access EEG-based BCI dataset for inner speech recognition. Data: OpenNeuro link

vmcru avatar Feb 06 '24 21:02 vmcru

These are 2 other interesting ones someone pointed out on the Slack channel -

  1. Continuous sensorimotor rhythm based brain computer interface learning in a large population - Data
  2. A large electroencephalographic motor imagery dataset for electroencephalographic brain computer interfaces - Data

Hi @Div12345, I was interested in the second dataset, but unfortunately, I did not find it in the MOABB documentation. Are there any plans related to adding the second dataset in the near future, or is the dataset already part of the library under some specific section or with a specific name?

HarlockOfficial avatar Feb 12 '24 14:02 HarlockOfficial

All the datasets in this paper: https://arxiv.org/pdf/2402.08656.pdf

bruAristimunha avatar Feb 28 '24 13:02 bruAristimunha

okbalefthanded avatar Feb 29 '24 08:02 okbalefthanded

https://www.frontiersin.org/articles/10.3389/fnhum.2023.1134869/full

bruAristimunha avatar Mar 10 '24 22:03 bruAristimunha

Is someone working on BEnchmark database Towards BCI Application (https://www.frontiersin.org/journals/neuroscience/articles/10.3389/fnins.2020.00627/full)?

It is an SSVEP dataset with 70 subjects performing a 40-target cued-spelling task. I saw that it was referred to in the Datasets to include section, but found no issue related to it.

machinelatto avatar Apr 25 '24 02:04 machinelatto

Hi @machinelatto!

It seems like no one has focused on this task, or if someone started, they didn't commit or create a PR. Would you be interested?

You would basically need to create two functions, as shown in this tutorial: https://neurotechx.github.io/moabb/auto_tutorials/4_adding_a_dataset.html#sphx-glr-auto-tutorials-4-adding-a-dataset-py

One downloads the dataset and the other loads it using MNE.

bruAristimunha avatar Apr 25 '24 16:04 bruAristimunha
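The download/load split described above can be pictured with a small dependency-free sketch. It deliberately does not import MOABB or MNE: the class `SketchDataset`, the file naming scheme, and the injected `reader` callable are hypothetical. The method names and the nested `{session: {run: recording}}` return shape are meant to mirror what the linked tutorial does with a `BaseDataset` subclass, but treat them as assumptions and follow the tutorial for the real interface.

```python
from pathlib import Path


class SketchDataset:
    """Hypothetical stand-in for a dataset class; not MOABB's real base class."""

    def __init__(self, subjects, data_dir, reader):
        self.subjects = list(subjects)
        self.data_dir = Path(data_dir)
        # reader: callable that takes a file path and returns a recording.
        # In a real dataset this would be an MNE reader for the file format.
        self.reader = reader

    def data_path(self, subject):
        """Step 1 (download): return the local file paths for one subject."""
        if subject not in self.subjects:
            raise ValueError(f"Invalid subject number: {subject}")
        # A real implementation would fetch the file here if it is missing.
        # The file name pattern is made up for this example.
        return [self.data_dir / f"subject_{subject:02d}.edf"]

    def get_single_subject_data(self, subject):
        """Step 2 (load): read each file and return {session: {run: recording}}."""
        runs = {
            str(index): self.reader(path)
            for index, path in enumerate(self.data_path(subject))
        }
        return {"0": runs}
```

Passing the reader in as an argument keeps the sketch runnable without any data on disk; a real dataset class would call the appropriate MNE reader directly inside the loading method.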

Hi @bruAristimunha !

I'm probably going to use this dataset in my research, so I could try to create those functions in the next few weeks. If it goes well I'll open a PR.

machinelatto avatar Apr 27 '24 21:04 machinelatto