mne-python

Problem with visual_92_categories dataset

Open hoechenberger opened this issue 2 years ago • 4 comments

Reported at https://mne.discourse.group/t/mne-visual-92-categories-data-sample-subject-0-raw-1-fif-does-not-exist/5204

The dataset consists of 4 split files, but they've been manually renamed.

To fix: Read each file individually, passing on_split_missing='warn', then re-save it in a way that does not split it. So we're basically splitting the data manually without "linking" the files, which is what the Representational Similarity Analysis example expects.
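A minimal sketch of that read-and-re-save step, in case it helps. The output naming scheme and both helper names (`unlinked_name`, `resave_unlinked`) are made up for illustration; only `mne.io.read_raw_fif(..., on_split_missing=...)` and `Raw.save(..., split_size=...)` are actual MNE calls. It assumes each part is below the 2 GB FIF limit, otherwise MNE would split it again on save.

```python
from pathlib import Path


def unlinked_name(fname):
    """Hypothetical output name: 'run_00_raw.fif' -> 'run_00_unlinked_raw.fif'."""
    p = Path(fname)
    suffix = "_raw.fif"
    # Keep the '_raw.fif' ending so the new name still follows MNE's convention.
    stem = p.name[: -len(suffix)] if p.name.endswith(suffix) else p.stem
    return p.with_name(f"{stem}_unlinked{suffix}")


def resave_unlinked(fnames, overwrite=False):
    """Read each part on its own and write it back as a standalone file."""
    import mne  # imported lazily so unlinked_name() works without MNE installed

    written = []
    for fname in fnames:
        # 'warn' lets the read succeed even though the linked next part
        # cannot be found under the name stored in the file.
        raw = mne.io.read_raw_fif(fname, on_split_missing="warn")
        dest = unlinked_name(fname)
        # Assumption: the part is smaller than split_size, so a single file
        # without a reference to a "next" part is written.
        raw.save(dest, split_size="2GB", overwrite=overwrite)
        written.append(dest)
    return written
```

Calling `resave_unlinked([...])` with the four renamed files should then give four independent files that the example can load one at a time.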

Alternative: The decoding for all runs actually finishes in basically no time on my computer, so there's no need for this kind of hocus-pocus in the Representational Similarity Analysis example, which allows loading only a subset of runs. Instead, all runs should always be processed. This would then simplify fixing the dataset: rename the existing files to match the expected split pattern, and potentially load and save them under a new name, so that MNE ensures the split naming is done correctly.
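For the renaming variant, a rough sketch of how the target names could be computed. With MNE's default "neuromag" split naming, the parts after the first get a `-1`, `-2`, ... suffix before the extension. The file names below are placeholders, not the actual names in the dataset, and `neuromag_split_names` / `plan_renames` are hypothetical helpers. Renaming alone may not be enough if the files store a different "next file" name internally, which is why loading and re-saving afterwards is the safer route.

```python
from pathlib import Path


def neuromag_split_names(first_name, n_parts):
    """Names expected for a split recording under the 'neuromag' scheme.

    'x_raw.fif' -> ['x_raw.fif', 'x_raw-1.fif', 'x_raw-2.fif', ...]
    """
    p = Path(first_name)
    return [
        p.name if i == 0 else f"{p.stem}-{i}{p.suffix}"
        for i in range(n_parts)
    ]


def plan_renames(existing, first_name):
    """Pair each existing file (in run order) with its expected split name."""
    targets = neuromag_split_names(first_name, len(existing))
    return [
        (Path(src), Path(src).with_name(dst))
        for src, dst in zip(existing, targets)
    ]


# Placeholder names; apply the plan with src.rename(dst) once it looks right.
for src, dst in plan_renames(["data/part_a.fif", "data/part_b.fif"], "example_raw.fif"):
    print(src.name, "->", dst.name)
```

After renaming, reading the first file with `mne.io.read_raw_fif` and saving it under a new name would let MNE write the split names itself.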

hoechenberger avatar Jul 02 '22 19:07 hoechenberger

IIRC the problem was that the raw file ended up being huge, so hosting it (and people reliably downloading it) was going to be tricky

larsoner avatar Jul 05 '22 13:07 larsoner

> the problem was that the raw file ended up being huge

if it's recreated still as a split file but with proper naming, this won't be an issue though right?

drammock avatar Jul 05 '22 13:07 drammock

By "the raw file" I should have said "the set of raw files". I thought it was 8 or 10 GB worth of raw data or so, but I could be wrong...

larsoner avatar Jul 05 '22 13:07 larsoner

it's 6ish GB...


If we fix this we should also get rid of the unnecessary "dotfiles".

My vote is for keeping them split, and fixing the naming so that they actually load as split files, as long as that doesn't have a huge impact on the tutorial run time.

drammock avatar Jul 05 '22 14:07 drammock