moabb
Split train and validation data with StratifiedKFold
This PR proposes a train/validation split via StratifiedKFold to ensure a balanced distribution of labels across the training and validation subsets. See issue #474.
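The idea can be sketched as follows. This is a minimal illustration, not the code from this PR: the helper name `stratified_reorder`, the toy data, and the `n_splits=5` and `seed` values are assumptions chosen for the example. It reorders the training samples so that the trailing ~20% (the part a "last fraction" validation split would pick up) contains every class in proportion.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def stratified_reorder(X, y, n_splits=5, seed=42):
    """Reorder (X, y) so the trailing 1/n_splits of the samples is label-balanced.

    Hypothetical helper for illustration; not part of moabb.
    """
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Use only the first fold: with n_splits=5, train_idx holds ~80% of the
    # samples and val_idx ~20%, each with the same class proportions as y.
    train_idx, val_idx = next(skf.split(X, y))
    # Put the validation indices last so a "last 20%" split selects them.
    order = np.concatenate([train_idx, val_idx])
    return X[order], y[order], len(val_idx)


# Toy data: 4 classes, 25 trials each, stored sorted by class (worst case
# for a naive "last 20%" split, which would then see a single class).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = np.repeat(np.arange(4), 25)

naive_counts = np.bincount(y[-20:], minlength=4)

X_new, y_new, n_val = stratified_reorder(X, y)
val_counts = np.bincount(y_new[-n_val:], minlength=4)

print("naive validation class counts:     ", naive_counts)
print("stratified validation class counts:", val_counts)
```

With Keras, the reordered arrays could then be passed to `model.fit(X_new, y_new, validation_split=n_val / len(y_new))`, since `validation_split` takes the last fraction of the samples before any shuffling. How the split is actually wired into the pipelines is not shown here.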
The results below were obtained with the deep learning (DL) pipelines on the first 5 subjects of the PhysionetMotorImagery dataset, classifying trials into "left_hand", "right_hand", "feet", and "hands" with WithinSession evaluation.
With the default train/validation split (last ~20% of training data used for validation):

|   | Dataset | Evaluation | Pipeline | Score |
|---|---------|------------|----------|-------|
| 0 | PhysionetMotorImagery | WithinSession | Keras_EEGNet_8_2 | 0.284444 |
| 1 | PhysionetMotorImagery | WithinSession | Keras_EEGTCNet | 0.235556 |
| 2 | PhysionetMotorImagery | WithinSession | Keras_DeepConvNet | 0.293333 |
| 3 | PhysionetMotorImagery | WithinSession | Keras_EEGITNet | 0.280000 |
| 4 | PhysionetMotorImagery | WithinSession | Keras_ShallowConvNet | 0.466667 |
| 5 | PhysionetMotorImagery | WithinSession | Keras_EEGNeX | 0.275556 |
With the training indices reordered using StratifiedKFold (last ~20% of training data used for validation):

|   | Dataset | Evaluation | Pipeline | Score |
|---|---------|------------|----------|-------|
| 0 | PhysionetMotorImagery | WithinSession | Keras_EEGNet_8_2 | 0.397778 |
| 1 | PhysionetMotorImagery | WithinSession | Keras_EEGTCNet | 0.271111 |
| 2 | PhysionetMotorImagery | WithinSession | Keras_DeepConvNet | 0.293333 |
| 3 | PhysionetMotorImagery | WithinSession | Keras_EEGITNet | 0.275556 |
| 4 | PhysionetMotorImagery | WithinSession | Keras_ShallowConvNet | 0.508889 |
| 5 | PhysionetMotorImagery | WithinSession | Keras_EEGNeX | 0.228889 |