Split train and validation data with StratifiedKFold

Open Sara04 opened this issue 1 year ago • 1 comments

This PR proposes a train/validation split via StratifiedKFold to ensure a balanced distribution of labels across the training and validation subsets. See issue #474.
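To illustrate why stratification matters here, the following is a minimal sketch (not the code from this PR) comparing the label distribution of a validation set taken from the tail of the data against one taken from a StratifiedKFold fold. The toy labels, their class-sorted ordering, and the choice of 5 folds are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical labels: 4 classes, 20 trials each, sorted by class
# (a worst case for a split that takes the tail of the data).
y = np.repeat(["left_hand", "right_hand", "feet", "hands"], 20)
X = np.zeros((len(y), 1))  # placeholder features; only labels drive the split

# Tail split: the last 20% of trials become the validation set.
n_val = int(0.2 * len(y))
tail_val_labels = y[-n_val:]

# Stratified split: use the first fold of a 5-fold StratifiedKFold (~20%).
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
train_idx, val_idx = next(skf.split(X, y))
strat_val_labels = y[val_idx]

# Count how many validation trials each class contributes.
tail_counts = dict(zip(*np.unique(tail_val_labels, return_counts=True)))
strat_counts = dict(zip(*np.unique(strat_val_labels, return_counts=True)))
print("tail split:", tail_counts)
print("stratified:", strat_counts)
```

In this toy setup the tail split contains a single class, while the stratified fold contains the same number of trials from each class.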

Sara04, Sep 18 '23 08:09

Results obtained with the DL pipelines on the first 5 subjects of the PhysionetMotorImagery dataset, classifying into "left_hand", "right_hand", "feet", and "hands" with WithinSession evaluation.

With default train valid split (last ~20% of training data used for validation):

    0  PhysionetMotorImagery  WithinSession  Keras_EEGNet_8_2       0.284444
    1  PhysionetMotorImagery  WithinSession  Keras_EEGTCNet         0.235556
    2  PhysionetMotorImagery  WithinSession  Keras_DeepConvNet      0.293333
    3  PhysionetMotorImagery  WithinSession  Keras_EEGITNet         0.280000
    4  PhysionetMotorImagery  WithinSession  Keras_ShallowConvNet   0.466667
    5  PhysionetMotorImagery  WithinSession  Keras_EEGNeX           0.275556

With reordering of the training indices using StratifiedKFold (last ~20% of training data used for validation):

    0  PhysionetMotorImagery  WithinSession  Keras_EEGNet_8_2       0.397778
    1  PhysionetMotorImagery  WithinSession  Keras_EEGTCNet         0.271111
    2  PhysionetMotorImagery  WithinSession  Keras_DeepConvNet      0.293333
    3  PhysionetMotorImagery  WithinSession  Keras_EEGITNet         0.275556
    4  PhysionetMotorImagery  WithinSession  Keras_ShallowConvNet   0.508889
    5  PhysionetMotorImagery  WithinSession  Keras_EEGNeX           0.228889
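The reordering idea can be sketched roughly as follows: if the downstream fit takes its validation set from the last ~20% of the training data, then placing one stratified fold at the end of the index order makes that tail class-balanced. This is only a sketch under assumptions, not the PR's actual implementation; the helper name `stratified_reorder`, the integer toy labels, and the fixed random seed are hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def stratified_reorder(y, n_splits=5, random_state=42):
    """Return a permutation of indices whose last ~1/n_splits is a stratified fold.

    Hypothetical helper for illustration; not part of MOABB.
    """
    y = np.asarray(y)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=random_state)
    # Features are irrelevant for stratification, so a zero placeholder is enough.
    train_idx, val_idx = next(skf.split(np.zeros(len(y)), y))
    # Put the stratified validation fold at the end of the ordering.
    return np.concatenate([train_idx, val_idx])


# Toy labels: 4 classes, 25 trials each, sorted by class.
y = np.repeat([0, 1, 2, 3], 25)
order = stratified_reorder(y)
y_reordered = y[order]

# A tail split on the reordered labels now sees every class equally often.
n_val = len(y) // 5
tail = y_reordered[-n_val:]
print(np.bincount(tail))
```

The same `order` would also have to be applied to the feature array so that trials and labels stay aligned.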

Sara04, Sep 18 '23 09:09